PITX2 - a marker to predict survival of patients diagnosed with breast cell proliferative disease



The present invention relates to methods for predicting the survival of a human being diagnosed
with a cell proliferative disorder of the breast tissues, characterized by a step of determining
the expression level of PITX2 or the genetic or the epigenetic modifications of the
genomic DNA associated with the gene PITX2. The invention also relates to sequences,
oligonucleotides and antibodies which can be used within the described methods.



Field of the Invention 



BREAST CANCER SURVIVAL 

In European and American women, breast cancer is the most frequently diagnosed cancer and
the second leading cause of cancer death. In women aged 40-55, breast cancer is the leading
cause of death (Greenlee et al., 2000). In 2002, there were 204,000 new cases of breast cancer
in the US and a comparable number in Europe. 

Breast cancer is defined as the uncontrolled proliferation of cells within breast tissues. Breasts
are composed of 15 to 20 lobes joined together by ducts. Cancer arises most commonly in the
duct, but is also found in the lobes; the rarest type of cancer is termed inflammatory breast
cancer. It will be appreciated by those skilled in the art that there exists a continuing need to
improve methods of early detection, classification and treatment of breast cancers. In contrast
to the detection of some other common cancers, such as cervical and dermal cancers, there are
inherent difficulties in classifying and detecting breast cancers.

The first step of any treatment is the assessment of the patient's condition relative to defined
classifications of the disease. However, the value of such a system is inherently dependent
upon the quality of the classification. Breast cancers are staged according to their size,
location and occurrence of metastasis. Methods of treatment include the use of surgery, radia- 
tion therapy, chemotherapy and endocrine therapy, which are also used as adjuvant therapies 
to surgery. In general, more aggressive disease should be treated with more aggressive thera- 
pies. 



Although the vast majority of early cancers are operable, i.e. the tumor can be completely
removed by surgery, about one third of the patients with lymph-node negative disease and
about 50-60% of patients with node-positive disease will develop metastases during follow-up.

Based on this observation, systemic adjuvant treatment has been introduced for both node-positive
and node-negative breast cancers. Systemic adjuvant therapy is administered after
surgical removal of the tumour, and has been shown to reduce the risk of recurrence significantly.
Several types of adjuvant treatment are available: endocrine treatment, also called
hormone treatment (for hormone receptor positive tumours), different chemotherapy regimens,
and antibody treatments based on novel agents like Herceptin (an antibody to an epidermal
growth factor receptor).

The growth of the majority of breast cancers (approx. 70-80%) is dependent on the presence of
estrogen. Therefore, one important target for adjuvant therapy is the removal of estrogen (e.g.
by ovarian ablation) or the blocking of its synthesis or the blocking of its actions on the tumor
cells either by blocking the receptor with competing substances (e.g. Tamoxifen) or by inhibiting
the conversion of androgen into estrogen (e.g. aromatase inhibitors). This type of treatment
is called "endocrine treatment". Endocrine treatment is thought to be efficient only in
tumors that express hormone receptors (the estrogen receptor (ER) and/or the progesterone
receptor (PR)). Currently, the vast majority of women with hormone receptor positive breast
cancer receive some form of endocrine treatment, independent of their nodal status. The most
frequently used drug in this scenario is Tamoxifen.

However, even in hormone receptor positive patients, not all patients benefit from endocrine 
treatment. Adjuvant endocrine therapy reduces mortality rates by 22% while response rates to 
endocrine treatment in the metastatic (advanced) setting are 50 to 60%. 

Since Tamoxifen has relatively few side effects, treatment may be justified even for patients
with low likelihood of benefit. However, these patients may require additional, more aggressive
adjuvant treatment. Even in earliest and least aggressive tumours, such as node-negative,
hormone receptor positive tumours, about 21% of patients relapse within 10 years after initial
diagnosis if they [...] treatment (Lancet 1998
May 16;351(9114):1451-67. Tamoxifen for early breast cancer: an overview of the randomised
trials. Early Breast Cancer Trialists' Collaborative Group). Similarly, some patients with
hormone receptor negative disease may be treated sufficiently with surgery and potentially
radiotherapy alone, whereas others may require additional chemotherapy.

Several cytotoxic regimens have been shown to be effective in reducing the risk of relapse in breast
cancer (Mansour et al., 1998). According to current treatment guidelines, most node-positive
patients receive adjuvant chemotherapy both in the US and Europe, since the risk of relapse is 
considerable. Nevertheless, not all patients do relapse, and there is a proportion of patients 
who would never have relapsed even without chemotherapy, but who nevertheless receive 
chemotherapy due to the currently used criteria. In hormone receptor positive patients, che- 
motherapy is usually given before endocrine treatment, whereas hormone receptor negative 
patients receive only chemotherapy. 

The situation for node-negative patients is particularly complex. In the US, cytotoxic chemotherapy
is recommended for node-negative patients if the tumour is larger than 1 cm. In
Europe, chemotherapy is considered for the node-negative cases if one or more risk factors,
such as tumour size larger than 2 cm, negative hormone receptor status, tumour grading of
three or age <35, are present. In general, there is a tendency to select premenopausal women for
additional chemotherapy whereas for postmenopausal women, chemotherapy is often omitted.
Compared to endocrine treatment, in particular Tamoxifen or aromatase inhibitors, chemo- 
therapy is highly toxic, with short-term side effects such as nausea, vomiting, bone marrow 
depression, and long-term effects such as cardiotoxicity and an increased risk for secondary 
cancers. 

It is currently not clear which breast cancer patients should be selected for more aggressive
therapy and which would do well without additional aggressive treatment, and clinicians
agree that there is a large need for proper selection of patients. The difficulty of selecting the
right patients for adjuvant treatment and selecting the right adjuvant treatment, and the lack of
suitable criteria, are also reflected by a recent study which showed that chemotherapy is used
much less frequently than recommended, based on data from the New Mexico Tumor registry
(Du et al., 2003). This study provided substantial evidence that there is a need for better selection
of patients for chemotherapy or other, more aggressive forms of breast cancer therapy.



This invention is related to a new biomarker, which can be used to solve the problem described
above. Based on the observation that methylation of the gene PITX2 (also known as
PTX2) in breast tumour tissue, obtained from the surgically removed tumour or
from biopsy material prior to the removal, is correlated with the survival time of breast cancer
patients treated with Tamoxifen monotherapy, we invented a tool allowing a better selection
of breast cancer patients for adjuvant therapy, for example for a cytotoxic therapy (chemotherapy),
in addition to or instead of an endocrine treatment such as treatment with Tamoxifen or
aromatase inhibitors.

It can be concluded that the expression levels of the protein or mRNA also correlate with said
prognosis. Therefore, the analysis of the expression levels of PITX2 protein or PITX2
mRNA, or the analysis of the patient's individual genetic or epigenetic modification of the
gene PITX2 - summarised as the analysis of expression of the gene PITX2 - may serve as a
method for predicting the survival of a patient with breast cancer. In particular, the invention
relates to methods for predicting the survival of a patient with breast cancer who is treated
with at least one adjuvant endocrine treatment, wherein endocrine treatment is meant to comprise
any treatment targeting the estrogen receptor pathway, the estrogen synthesis pathway or
the estrogen conversion pathway, i.e. any pathway involved in estrogen metabolism, production or
secretion.

In this context survival is meant to describe the time from diagnosis or start of treatment to an 
endpoint, which may be the time of death (considering any reason for death or only death 
from breast cancer), or the time of recurrence of breast cancer, which may be local or distant, 
or the time of occurrence of any breast cancer associated disease. Therefore "predicting the
survival" is meant to comprise predicting the disease free survival, as well as the overall survival
or any other consideration of time between diagnosis and endpoint of treatment.
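
The endpoint definitions above can be made concrete with a small sketch. It assumes that each patient record holds a start date (diagnosis or start of treatment) and optional event dates; the function name `survival_days` and the example dates are hypothetical and are not taken from the application.

```python
from datetime import date


def survival_days(start, events):
    """Return days from the start date to the earliest recorded endpoint.

    `events` is a list of dates (or None for events that did not occur).
    Returns None when no endpoint has been observed.
    """
    observed = [event for event in events if event is not None]
    if not observed:
        return None
    return (min(observed) - start).days


# Hypothetical patient: diagnosis, later recurrence, later death.
diagnosis = date(2000, 1, 10)
recurrence = date(2003, 1, 10)
death = date(2005, 1, 10)

# Disease free survival ends at the first of recurrence or death.
disease_free = survival_days(diagnosis, [recurrence, death])
# Overall survival considers death only.
overall = survival_days(diagnosis, [death])
```

Which events count as an endpoint (death from any cause, death from breast cancer only, local or distant recurrence) is a choice made when the list of events is assembled, which mirrors the alternatives listed in the text.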

PITX2 (also known as PTX2, RS, RGS, ARP1, Brx1, IDG2, IGDS, IHG2, RIEG, IGDS2,
IRID2, Otlx2, RIEG1, MGC20144) is known to belong to the PTX subfamily of PTX1,
PTX2, and PTX3 genes which define a novel family of transcription factors, within the
paired-like class of homeodomain factors. The gene PITX2 (according to NM_153426) encodes
the paired-like homeodomain transcription factor 2, which is known to be expressed
during development of anterior structures such as the eye, teeth, and anterior pituitary.



Toyota et al. (2001) (Blood 97: p 2823-9) found hypermethylation of the PITX2 gene in a
large proportion of acute myeloid leukemias. Furthermore, in this study hypermethylation of
PITX2 is positively correlated to methylation of the ER gene and to a reduced expression
level. Means to analyse the methylation pattern of the PITX2 gene have also been described in a
number of patent applications: WO 02/077272 is related to the use of methylation markers
to differentiate between AML and ALL, WO 01/19845 is related to several differentially
methylated sequences useful for diagnosis of several cell proliferative disorders, and WO
02/00927 and WO 01/092565 are related to the use of methylation markers to diagnose diseases
associated with development genes or associated with DNA transcription, respectively.

Although the methylation of PITX2 has been associated with development, transcription and 
disease such as cancer, it has no heretofore recognised role in the outcome prediction of breast 
cancer patients or responsiveness to endocrine treatment. 

EXPRESSION ANALYSIS 

The expression of a gene, or rather the protein encoded by the gene, can be studied on four
different levels: firstly, protein expression levels can be determined directly; secondly, mRNA
transcription levels can be determined; thirdly, epigenetic modifications, such as the gene's DNA
methylation profile or the gene's histone profile, can be analysed, as methylation is often correlated
with inhibited protein expression; and fourthly, the gene itself may be analysed for genetic
modifications such as mutations, deletions, polymorphisms etc. influencing the expression
of the gene product.

The levels of observation that have been studied by the methodological developments of recent
years in molecular biology are the genes themselves, the transcription of these genes into
RNA, and the translation into the resulting proteins. However, how the activation and inhibition
of specific genes, in specific cells and tissues, at specific time points in the course of development
of an individual is controlled is correlatable to the degree and character of the
methylation of the genes or, respectively, the genome. In this respect, pathogenic conditions
may manifest themselves in a changed methylation pattern of individual genes or of the genome.

The four terms that apply to the fields of overall genome-wide analysis of all these biological
processes are called: Proteomics, Transcriptomics, Epigenomics (or Methylomics) and Ge- 
nomics. Methods and techniques that can be used for studying expression or studying the 
modifications responsible for expression on all of these levels are well described in the lit- 
erature and therefore known to a person skilled in the art. They are described in text books of 
molecular biology and in a large number of scientific journals. 

How to analyse the protein expression of a single gene is prior art. It usually requires an anti- 
body specific for the gene product of interest. Appropriate technologies would be ELISA or 
Immunohistochemistry. 

The analysis of the level of mRNA also has been described sufficiently. These days the gold
standard is reverse transcriptase PCR.

To avoid duplication a more detailed description of the prior art relating to existing and well 
known technologies is given within the description of the invention, as it is part of the inven- 
tion. 

US patent application 2003/0198970 by Gareth Roberts lists some of the technologies and
methods on how to determine a person's "genetic make up", i.e. the genetic modifications,
such as deletions, polymorphisms, mutations etc. that may vary between individuals, and describes
the potential role of this genetic sequence information in the individual's variability in
disease, response to therapy and prognosis. Epigenetic differences, however, are not mentioned.
The gene PITX2 is listed within this application as one gene name out of a long and
comprehensive list of about 2,500 other gene names, suggesting its expression could play a
role in some kind of treatment response. However, this is simply an assumption based on
speculation only, as no experiments are disclosed which demonstrate any kind of relation
between genetic modifications of PITX2 and an individual's variation in treatment response.

A less established area in this context is the field of epigenomics or epigenetics, i.e. the field 
concerned with analysis of DNA methylation patterns. 

Methylation of DNA can play an important role in the control of gene expression in mammalian
cells. DNA methyltransferases are involved in this process and catalyse the
transfer of a methyl group from S-adenosylmethionine to cytosine residues to form 5-methylcytosine,
a modified base that is found mostly at CpG sites in the genome. The presence
of methylated CpG islands in the promoter region of genes can suppress their expression.
This process may be due to the presence of 5-methylcytosine, which apparently interferes
with the binding of transcription factors or other DNA-binding proteins to block transcription.
In different types of tumours, aberrant or accidental methylation of CpG islands in
the promoter region has been observed for many cancer-related genes, resulting in the silenc- 
ing of their expression. Such genes include tumour suppressor genes, genes that suppress 
metastasis and angiogenesis, and genes that repair DNA (Momparler and Bovenzi (2000) J. 
Cell Physiol. 183:145-54). 

In addition it has been described that DNA methylation may also play a role in the field of
pharmacogenetics. An approach similar to that described by Gareth Roberts in US application
2003/0198970, on how to apply information about genetic modifications of the genome to the
analysis of individual responses to treatment, was already the subject of the application
WO 02/037398, tailored to the application of information about epigenetic modifications
of the genome, based on DNA methylation analysis, to guide treatment selection and
to study individuals' treatment responses.

An example for the applicability of this idea was given by Esteller et al. (Esteller et al. (2000)
N Engl J Med. 2000 Nov 9;343(19):1350-4), who demonstrated that methylation of the
MGMT promoter in gliomas is a useful predictor of the responsiveness of the tumours to alkylating
agents. More recently, Frühwald has summarised a series of studies demonstrating
that DNA methylation is associated with the aggressiveness of different cancers (Frühwald
MC. DNA methylation patterns in cancer: novel prognostic indicators? Am J Pharmacogenomics.
2003;3(4):245-60).

An example for the potential of analysis of epigenetic modifications, such as DNA methylation
analysis, for the prediction of treatment response related to breast cancer was presented
as a poster by Martens et al. at the San Antonio Breast Cancer Symposium, San Antonio, TX,
December 3-6, 2003. Breast cancer patients who had had their tumours removed by surgery
and developed metastases at some point after the removal were treated with Tamoxifen,
an endocrine treatment drug. The primary tumour samples were analysed for aberrant methylation
patterns. The patients were then divided into two subclasses according to their objective
tumour response: patients with progressive disease (which could be described as increasing
metastasis size) and patients with complete or partial remission of the relapsed tumour
(which could be described as decreasing metastasis size). It turned out that those patients
who had a tumour removed and experienced a remission (decrease in size) of the metastasis
under endocrine treatment had suffered from a tumour which showed a distinct pattern of
DNA methylation at specific CpG sites, whereas patients who showed progressive disease (did
not experience a decrease but an increase in size of their metastases) under endocrine treatment
had suffered from a tumour which did not show this distinct pattern of DNA methylation
(but a different pattern) at these CpG sites. This is a clear indication that the methylation
pattern described in that study can serve as a predictive treatment response tool for an endocrine
treatment, like tamoxifen. The results of this study, i.e. predictive biomarkers and assays
therefor, are the subject of patent application WO 04/035803, published on April 29, 2004:
Method and nucleic acid for the improved treatment of breast cell proliferative disorders. Predictive
markers as described above will also be called 'metastatic' markers in the context of
this application. PITX2 is also listed as a predictive marker in said application.

Currently several predictive markers are under evaluation. As up to now most patients have
received Tamoxifen as endocrine treatment, most of the markers have been shown to be associated
with response or resistance to tamoxifen. However, it is generally assumed that there is
a large overlap between responders to one or the other endocrine treatment. In fact, ER and
PR expression are used to select patients for any endocrine treatment. Among the markers
which have been associated with tamoxifen response is bcl-2. High bcl-2 expression levels
showed promising correlation to tamoxifen therapy response in patients with metastatic disease
and prolonged survival and added valuable information to an ER negative patient subgroup
(J Clin Oncology, 1997, 15 (5): 1916-1922; Endocrine, 2000, 13(1):1-10). There is conflicting
evidence regarding the independent predictive value of c-erbB2 (Her2/neu) overexpression
in patients with advanced breast cancer that requires further evaluation and verification
(British J of Cancer, 1999, 79 (7/8): 1220-1226; J Natl Cancer Inst, 1998, 90 (21): 1601-1608).

Other predictive markers include SRC-1 (steroid receptor coactivator-1), CGA mRNA overexpression,
cell kinetics and S phase fraction assays (Breast Cancer Res and Treat, 1998,
48:87-92; Oncogene, 2001, 20:6955-6959). Recently, uPA (urokinase-type plasminogen activator)
and PAI-1 (plasminogen activator inhibitor type 1) together were shown to be useful to
define a subgroup of patients who have worse prognosis and who would benefit from adjuvant
systemic therapy (J Clinical Oncology, 2002, 20 (4)). However, all of these markers
need further evaluations in prospective trials as none of them is yet a validated marker of response.

Also recently published was a study related to the prognostic power of methylation analysis 
in breast cancer patients. Muller et al. (Muller HM, Widschwendter A, Fiegl H, Ivarsson L, 
Goebel G, Perkmann E, Marth C, Widschwendter M. (2003) DNA methylation in serum of 
breast cancer patients: an independent prognostic marker. Cancer Res. 2003 Nov 15;63(22):
7641-5.) reported a set of genes which can be used as biomarkers in patient pre-therapeutic
sera for the prognosis of breast cancer. Specific aberrant methylation patterns of
two genes found in DNA from pretreatment serum of cancer patients indicated whether their
prognosis was good or bad. The DNA analysed was not tumour DNA but serum DNA. Most
likely the presence of a tumour-specific pattern indicates that tumour derived DNA is present.
However, the absence of a specific methylation pattern may be due to a tumour which does
not show this methylation pattern, or a tumour which does not shed sufficient DNA into the
blood stream. Good or bad prognosis was defined as long or short "overall survival" after
surgery, without adjuvant treatment. This result therefore relates to untreated patients only.

These 'prognostic' markers are able to answer the question whether or not a breast cancer 
patient should get an aggressive adjuvant treatment like chemotherapy after removal of the 
tumour to avoid recurrence of cancer, i.e. occurrence of metastases.

However, none of these study results and none of these markers is able to answer the specific
question raised above, whether or not a breast cancer patient should get adjuvant chemotherapy
after removal of the tumour to avoid recurrence of cancer, i.e. occurrence of metastases, in
addition or as an alternative to endocrine treatment (with a drug like tamoxifen, or aromatase
inhibitors).

A marker for a bad prognosis for cancer patients (without treatment) might not be applicable
to a patient under adjuvant treatment with a drug like tamoxifen. Therefore the test would not
be able to help decide whether chemotherapy, including all its side effects and inherent
risks, is necessary or whether endocrine treatment is sufficient, because an endocrine treatment
might change the prognosis from "bad" to "good".

The predictive 'metastatic' marker set described above would be able to identify, amongst all
patients who relapsed (developed metastases after surgery), those patients who do not respond
to endocrine treatment (by partial or complete remission of the relapsed tumour). These
markers, however, cannot be applied to answer the question whether metastases will occur at
all (after surgery of the primary tumour under endocrine treatment), and consequently whether
it is advisable to give adjuvant chemotherapy to avoid recurrence of cancer (i.e. relapse or occurrence
of metastases).

In one aspect the present invention provides a marker, PITX2 (which shall be recognised as
the gene encoding the protein PITX2; according to NM_153426), that can be used to answer
that question and help guide the decision whether or not an adjuvant cytotoxic therapy
shall be prescribed in addition to or instead of endocrine treatment, such as tamoxifen. A
marker able to answer this question will also be called an 'adjuvant' marker in the context of
this application.

In addition, study results presented by Paik et al. at the San Antonio Breast Cancer Symposium,
San Antonio, TX, December 3-6, 2003 provide an answer to this question by analysing
the mRNA expression pattern of 16 genes plus 5 controls with RT-PCR.

The provided invention however has the advantage that looking at only one gene or a small 
selection of three to five genes will give sufficient information for a validated prognosis. 

For demonstration: The 'metastatic' test (use of a 'metastatic' marker) tells a patient whether
she is unlikely to respond to endocrine treatment when she develops metastases. But she does
not know how high the likelihood is that she will experience a relapse at all. The 'prognostic'
test (use of a 'prognostic' marker) tells a patient whether she will have a good or bad prognosis
without any treatment. Even with a "bad prognosis", endocrine treatment might be enough,
though. The prognostic markers are not necessarily able to predict the outcome under endocrine
treatment. The 'adjuvant test' (use of an 'adjuvant' marker) tells her whether she will or
will not develop recurrence without chemotherapy, even when treated with the standard endocrine
treatment, which has few side effects.

This invention relates to the use of PITX2 as an 'adjuvant marker', which also serves as a
'prognostic marker', especially in hormone receptor negative women, who would not get
any endocrine treatment at all.

5-methylcytosine is the most frequent covalent base modification in the DNA of eukaryotic 
cells. It plays a role, for example, in the regulation of the transcription, in genetic imprinting, 
and in tumorigenesis. Therefore, the identification of 5-methylcytosine as a component of 
genetic information is of considerable interest. However, 5-methylcytosine positions cannot
be identified by sequencing since 5-methylcytosine has the same base pairing behaviour as
cytosine. Moreover, the epigenetic information carried by 5-methylcytosine is completely lost
during PCR amplification.

A relatively new and currently the most frequently used method for analysing DNA for 5-methylcytosine
is based upon the specific reaction of bisulfite with cytosine which, upon subsequent
alkaline hydrolysis, is converted to uracil, which corresponds to thymidine in its base
pairing behaviour. However, 5-methylcytosine remains unmodified under these conditions.
Consequently, the original DNA is converted in such a manner that methylcytosine, which
originally could not be distinguished from cytosine by its hybridisation behaviour, can now be
detected as the only remaining cytosine using "normal" molecular biological techniques, for
example, by amplification and hybridisation or sequencing. All of these techniques are based
on base pairing which can now be fully exploited. In terms of sensitivity, the prior art is defined
by a method which encloses the DNA to be analysed in an agarose matrix, thus preventing
the diffusion and renaturation of the DNA (bisulfite only reacts with single-stranded
DNA), and which replaces all precipitation and purification steps with fast dialysis (Olek A,
Oswald J, Walter J. A modified and improved method for bisulphite based cytosine methylation
analysis. Nucleic Acids Res. 1996 Dec 15;24(24):5064-6). Using this method, it is possible
to analyse individual cells, which illustrates the potential of the method. However, currently
only individual regions of a length of up to approximately 3000 base pairs are analysed;
a global analysis of cells for thousands of possible methylation events is not possible. Moreover,
this method cannot reliably analyse very small fragments from small sample quantities.
These are lost through the matrix in spite of the diffusion protection.

An overview of the further known methods of detecting 5-methylcytosine may be gathered
from the following review article: Rein, T., DePamphilis, M. L., Zorbas, H., Nucleic Acids
Res. 1998, 26, 2255.
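
The principle of the bisulfite method described above can be illustrated with a minimal sketch. It models a single DNA strand as a string, treats every unmethylated cytosine as converted to uracil (read as thymine after amplification) and leaves 5-methylcytosine unchanged. The function names, the example fragment and the methylated position are hypothetical; complete conversion is assumed, which real experiments only approximate.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite treatment of one strand followed by amplification.

    Unmethylated C is deaminated to U and read as T; methylated C
    (positions listed in `methylated_positions`) stays C.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")  # C -> U -> read as T
        else:
            converted.append(base)  # 5-methylcytosine, A, G and T unchanged
    return "".join(converted)


def call_methylation(original, converted):
    """Return positions where a cytosine survived conversion (methylated)."""
    return [
        index
        for index, (before, after) in enumerate(zip(original.upper(), converted))
        if before == "C" and after == "C"
    ]


# Hypothetical fragment with CpG sites at positions 1 and 7;
# only the cytosine at position 1 is assumed to be methylated.
fragment = "ACGTTCACGGCA"
treated = bisulfite_convert(fragment, methylated_positions={1})
methylated = call_methylation(fragment, treated)
```

After conversion the only cytosine left in `treated` is the methylated one, so comparing the treated sequence with the reference is enough to read out the methylation state by sequencing or hybridisation.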

To date, barring few exceptions (e.g., Zeschnigk M, Lich C, Buiting K, Doerfler W,
Horsthemke B. A single-tube PCR test for the diagnosis of Angelman and Prader-Willi syndrome
based on allelic methylation differences at the SNRPN locus. Eur J Hum Genet. 1997
Mar-Apr;5(2):94-8) the bisulfite technique is only used in research. Always, however, short,
specific fragments of a known gene are amplified subsequent to a bisulfite treatment and either
completely sequenced (Olek A, Walter J. The pre-implantation ontogeny of the H19
methylation imprint. Nat Genet. 1997 Nov;17(3):275-6) or individual cytosine positions are
detected by a primer extension reaction (Gonzalgo ML, Jones PA. Rapid quantitation of
methylation differences at specific sites using methylation-sensitive single nucleotide primer
extension (Ms-SNuPE). Nucleic Acids Res. 1997 Jun 15;25(12):2529-31, WO 95/00669) or
by enzymatic digestion (Xiong Z, Laird PW. COBRA: a sensitive and quantitative DNA
methylation assay. Nucleic Acids Res. 1997 Jun 15;25(12):2532-4). In addition, detection by
hybridisation has also been described (Olek et al., WO 99/28498).

Further publications dealing with the use of the bisulfite technique for methylation detection
in individual genes are: Grigg G, Clark S. Sequencing 5-methylcytosine residues in genomic
DNA. Bioessays. 1994 Jun;16(6):431-6, 431; Zeschnigk M, Schmitz B, Dittrich B, Buiting K,
Horsthemke B, Doerfler W. Imprinted segments in the human genome: different DNA methylation
patterns in the Prader-Willi/Angelman syndrome region as determined by the genomic
sequencing method. Hum Mol Genet. 1997 Mar;6(3):387-95; Feil R, Charlton J, Bird AP,
Walter J, Reik W. Methylation analysis on individual chromosomes: improved protocol for
bisulphite genomic sequencing. Nucleic Acids Res. 1994 Feb 25;22(4):695-6; Martin V,
Ribieras S, Song-Wang X, Rio MC, Dante R. Genomic sequencing indicates a correlation
between DNA hypomethylation in the 5' region of the pS2 gene and its expression in human
breast cancer cell lines. Gene. 1995 May 19;157(1-2):261-4; WO 97/46705 and WO
95/15373.

An overview of the Prior Art in oligomer array manufacturing can be gathered from a special
edition of Nature Genetics (Nature Genetics Supplement, Volume 21, January 1999), published
in January 1999, and from the literature cited therein.

Fluorescently labelled probes are often used for the scanning of immobilised DNA arrays.
The simple attachment of Cy3 and Cy5 dyes to the 5'-OH of the specific probe is particularly
suitable for fluorescence labels. The detection of the fluorescence of the hybridised probes
may be carried out, for example, via a confocal microscope. Cy3 and Cy5 dyes, besides many
others, are commercially available.

Matrix Assisted Laser Desorption Ionization Mass Spectrometry (MALDI-TOF) is a very
efficient development for the analysis of biomolecules (Karas M, Hillenkamp F. Laser desorption
ionization of proteins with molecular masses exceeding 10,000 daltons. Anal Chem.
1988 Oct 15;60(20):2299-301). An analyte is embedded in a light-absorbing matrix. The matrix
is evaporated by a short laser pulse, thus transporting the analyte molecule into the vapour
phase in an unfragmented manner. The analyte is ionised by collisions with matrix molecules.
An applied voltage accelerates the ions into a field-free flight tube. Due to their different 
masses, the ions are accelerated at different rates. Smaller ions reach the detector sooner than 
bigger ones. 
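
The relation between ion mass and arrival time can be sketched numerically. The example below assumes an idealised linear time-of-flight instrument in which an ion of charge z·e gains the kinetic energy z·e·U in the acceleration region and then drifts through a tube of length L, so that t = L·sqrt(m / (2·z·e·U)). The acceleration voltage of 20 kV, the tube length of 1 m and the function name `flight_time` are illustrative assumptions, and effects such as the initial velocity spread are ignored.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def flight_time(mass_da, charge=1, voltage=20_000.0, tube_length=1.0):
    """Drift time in seconds of an ion through a field-free flight tube.

    Assumes the ion starts at rest and converts the full potential
    energy charge * e * voltage into kinetic energy before drifting.
    """
    mass_kg = mass_da * DALTON
    kinetic_energy = charge * ELEMENTARY_CHARGE * voltage
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)
    return tube_length / velocity


# Two singly charged ions; the heavier one is four times as massive.
light = flight_time(1_000)
heavy = flight_time(4_000)
```

Because the drift time grows with the square root of the mass, the ion that is four times heavier takes twice as long to reach the detector, which is the basis for separating analytes by mass.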

MALDI-TOF spectrometry is excellently suited to the analysis of peptides and proteins. The 
analysis of nucleic acids is somewhat more difficult (Gut I G, Beck S. DNA and Matrix As- 
sisted Laser Desorption Ionization Mass Spectrometry. Current Innovations and Future 
Trends. 1995, 1; 147-57). The sensitivity to nucleic acids is approximately 100 times worse 
than to peptides and decreases disproportionally with increasing fragment size. For nucleic 
acids having a multiply negatively charged backbone, the ionisation process via the matrix is 
considerably less efficient. In MALDI-TOF spectrometry, the selection of the matrix plays an 
eminently important role. For the desorption of peptides, several very efficient matrixes have 
been found which produce a very fine crystallisation. There are now several responsive ma- 
trixes for DNA, however, the difference in sensitivity has not been reduced. The difference in 
sensitivity can be reduced by chemically modifying the DNA in such a manner that it be- 
comes more similar to a peptide. Phosphorothioate nucleic acids in which the usual phos- 
phates of the backbone are substituted with thiophosphates can be converted into a charge- 
neutral DNA using simple alkylation chemistry (Gut IG, Beck S. A procedure for selective 
DNA alkylation and detection by mass spectrometry. Nucleic Acids Res. 1995 Apr 
25;23(8):1367-73). The coupling of a charge tag to this modified DNA results in an increase 
in sensitivity to the same level as that found for peptides. A further advantage of charge tagging 
is the increased stability of the analysis against impurities, which make the detection of 
unmodified substrates considerably more difficult. 

Genomic DNA is obtained from DNA of cell, tissue or other test samples using standard 
methods. This standard methodology is found in references such as Sambrook, Fritsch and 
Maniatis eds., Molecular Cloning: A Laboratory Manual, 1989. 

DESCRIPTION 

Characterisation of a breast cancer in terms of its predicted aggressiveness enables the physi- 
cian to make an informed decision as to a therapeutic regimen with appropriate risk and bene- 
fit trade offs to the patient. Aggressiveness is taken to mean one or more of decreased patient 
survival or disease- or relapse-free survival, increased tumour-related complications and faster 
progression of tumour or metastases. According to the aggressiveness of the disease an ap- 
propriate treatment or treatments may be selected from the group consisting of chemotherapy, 
radiotherapy, surgery, biological therapy, immunotherapy, antibody treatments, treatments 
involving molecularly targeted drugs, estrogen receptor modulator treatments, estrogen receptor 
down-regulator treatments, aromatase inhibitor treatments, ovarian ablation, treatments 
providing LHRH analogues or other centrally acting drugs influencing estrogen production. 
Where a cancer is characterised as 'aggressive' it is particularly preferred that a 
treatment such as, but not limited to, chemotherapy is provided in addition to or instead of an 
endocrine targeting therapy. 

Using the methods and nucleic acids described herein, statistically significant models of patient 
disease free survival or metastasis free survival or overall survival and/or disease progression 
can be developed and utilised to assist patients and clinicians in determining suitable 
treatment options to be included in the therapeutic regimen. 

In one aspect the described method is to be used to assess the utility of therapeutic regimens 
comprising one or more treatments which is either an aggressive therapy such as chemotherapy 
or a treatment which targets the estrogen receptor pathway or is involved in estrogen metabolism, 
production or secretion as a therapy for patients suffering from a cell proliferative 
disorder of the breast tissues. In particular this aspect of the method enables the physician to 
determine which treatments may be used in addition to or instead of said treatment. 

In a further aspect the described method enables the characterisation of the cell proliferative 
disorder in terms of aggressiveness, thereby enabling the physician to recommend suitable 
treatments. Thus, the present invention will be seen to reduce the problems associated with 
present breast cell proliferative disorder treatment response prediction methods. 

Using the methods and nucleic acids as described herein, patient survival can be evaluated 
before or during treatment for a cell proliferative disorder of the breast tissues, in order to 
provide critical information to the patient and clinician as to the likely progression of the disease. 
It will be appreciated, therefore, that the methods and nucleic acids exemplified herein 
can serve to improve a patient's quality of life and odds of treatment success by allowing both 
patient and clinician a more accurate assessment of the patient's treatment options. 

The method according to the definition may be used for the improved treatment of all breast 
cell proliferative disorder patients, both pre and post menopausal and independent of their 
node or estrogen receptor status. However, it is particularly preferred that said patients are 
node-negative and estrogen receptor positive. 

The present invention makes available a method for the improved treatment and monitoring 
of breast cell proliferative disorders, by enabling the accurate prediction of a patient's survival 
without systemic therapy or with endocrine therapy comprising one or more treatments which 
target the estrogen receptor pathway or are involved in estrogen metabolism, production, or 
secretion. 



In a particularly preferred embodiment, the method according to the invention enables the 
differentiation between patients who have a high risk of relapse under said endocrine therapy 
and those who have a low risk of relapse under said therapy. The method particularly preferably 
enables the determination of a methylation pattern characteristic for a predicted survival 
time, in addition to the characterisation of tumours in terms of aggressiveness. 

The method according to the invention may be used for the analysis of a wide variety of cell 
proliferative disorders of the breast tissues including, but not limited to, ductal carcinoma in 
situ, invasive ductal carcinoma, invasive lobular carcinoma, lobular carcinoma in situ, comedocarcinoma, 
inflammatory carcinoma, mucinous carcinoma, scirrhous carcinoma, colloid 
carcinoma, tubular carcinoma, medullary carcinoma, metaplastic carcinoma, and papillary 
carcinoma and papillary carcinoma in situ, undifferentiated or anaplastic carcinoma and Paget's 
disease of the breast. 

The method according to the invention is particularly suited to the prediction of survival of 
breast cancer in the following treatment setting. In one embodiment, the method is applied to 
patients who receive endocrine pathway targeting treatment as secondary treatment to an initial 
non chemotherapeutical therapy, e.g. surgery (hereinafter referred to as the adjuvant setting) 
as illustrated in Figure 1. Such a treatment is often prescribed to patients suffering from 
Stage 1 to 3 breast carcinomas. In this embodiment patients' survival times are predicted according 
to their gene expression or genetic or epigenetic modifications. By detecting patients 
with worse disease free survival times the physician may choose to recommend the patient for 
further treatment, instead of or in addition to the endocrine targeting therapy(s), in particular, 
but not limited to, chemotherapy. 

This invention specifically relates to a new biomarker, PITX2, for patients diagnosed with 
breast cell proliferative disease, allowing the prediction of survival (or outcome) without 
treatment, or with different therapies, like a cytotoxic therapy (chemotherapy) in addition or 
instead of (for example in hormone receptor negative patients) an endocrine treatment, like 
treatment with Tamoxifen or aromatase inhibitors, wherein the prediction is based on the pa- 
tient's survival or clinical or pathological tumour response, or response measured with other 
surrogate parameters. 

It is also an embodiment of the invention to base the prediction of survival on a combination 
of different markers together with PITX2. To increase the likelihood of a correct prognosis it 
is preferred that the expression of some selected additional genes is investigated in addition to 
the analysis of PITX2 expression. It is preferred that a mini panel comprising one or more, 
up to 7, additional genes is used. The preferred genes to be added to such a mini panel would 
be one or more out of the group comprising ABCA8, CDK6, ERBB2, ONECUT2, PLAU, 
TBC1D3 and TFF1. Especially preferred would be a combined analysis of PITX2 with 
TBC1D3. Another preferred combination would comprise PITX2 analysis with TBC1D3 
and CDK6 analysis. Wherever in the following the invention is described specifically for 
PITX2, it is meant to also include a preferred combination of PITX2 with one or more of the 
named genes. 

This invention therefore relates to new methods or tools, for patients diagnosed with breast 
cell proliferative disease, allowing the evaluation of adjuvant therapy based on a prediction of 
outcome. 

More specifically this invention provides new methods or tools, for patients diagnosed with 
breast cell proliferative disease, allowing the evaluation of adjuvant therapy, i.e. therapy be- 
fore or after surgical removal of the tumour, like a cytotoxic therapy (chemotherapy) in addi- 
tion to or instead of (for example in hormone receptor negative patients) an endocrine treat- 
ment, like treatment with Tamoxifen or aromatase inhibitors, wherein the evaluation is based 
on the prediction of the patient's survival. 

One aspect of the invention is the provision of tools for predicting the survival of a patient 
diagnosed with a breast cell proliferative disease, such as breast cancer. These tools comprise 
methods for the analysis of either the expression levels of PITX2 protein, or PITX2 mRNA or 
the analysis of the patient's individual genetic or epigenetic modification of the gene PITX2, 
summarised as the analysis of expression of the gene PITX2. Preferably the invention relates 
to methods for predicting the survival of a patient diagnosed with breast cancer. Preferably 
said patient is treated with at least one adjuvant endocrine treatment, wherein endocrine 
treatment is meant to comprise any treatment targeting the estrogen receptor pathway or es- 
trogen synthesis pathway or estrogen conversion pathway i.e., which is involved in estrogen 
metabolism, production or secretion. Preferably the patient was treated with said adjuvant 
endocrine treatment after surgical removal of the tumour. Also preferably the survival is the 
disease free survival. 

Especially preferred are methods applied for the prediction of the disease free survival of a 
patient diagnosed with breast cancer under adjuvant endocrine treatment after surgical tumour 
removal. Even more preferred are those methods, which analyse the DNA methylation profile 
of the genomic region associated with the gene PITX2. Especially preferred is the analysis of 
the DNA methylation profile of the genomic sequence given in SEQ ID NO: 1. Especially preferred 
is furthermore the analysis of the methylation status of eight specific CpG dinucleotides, 
covered in the three subsequences of said SEQ ID NO: 1 given in SEQ ID NOS: 13, 18 and 
19. The use of nucleic acids hybridising to these nucleic acid sequences for the prediction of 
survival according to the invention is a preferred embodiment of said invention. The use of 
nucleic acids hybridising to CpG positions within these nucleic acid sequences after these 
nucleic acids have been contacted with one or more agents that convert cytosine bases that are 
unmethylated at the 5'-position thereof to a base that is detectably dissimilar to cytosine in 
terms of hybridisation properties, for the prediction of survival according to the invention is 
an especially preferred embodiment of said invention. 

This methodology presents further improvements over the state of the art in that the method 
may be applied to any subject, independent of the estrogen and/or progesterone receptor 
status. Therefore in a preferred embodiment, the subject is not required to have been tested for 
estrogen or progesterone receptor status. 

The object of the invention is preferably achieved by means of the analysis of the methylation 
pattern of PITX2 and/or its regulatory region. In a particularly preferred embodiment the sequence 
of said gene comprises SEQ ID NO: 1 and the sequence complementary thereto. 

In one preferred embodiment the object of the invention is the prediction of survival of a 
subject under a treatment which targets the estrogen receptor pathway or is involved in estrogen 
metabolism, production or secretion. This is achieved by analysis of the expression pattern 
of PITX2, wherein it is further preferred that the sequence of said gene comprises 
SEQ ID NO: 1 or parts thereof. 

In one aspect the invention discloses novel methods utilising the gene PITX2 for the predic- 
tion of survival of a patient diagnosed with a breast cell proliferative disease. In a preferred 
embodiment said patient diagnosed with a breast cell proliferative disease is treated with ad- 
juvant endocrine monotherapy. 

The invention discloses the use of the gene PITX2, as well as its promoter and regulatory 
elements as a prognostic marker for survival of breast cancer patients. It is preferred that these 
patients are treated with adjuvant endocrine monotherapy. More specifically, the disclosed 
matter shows the applicability of said gene to answer the question, and to help guide the decision, 
whether or not an adjuvant chemotoxic therapy shall be prescribed, preferably in addition 
to endocrine treatment, like the treatment with tamoxifen or aromatase inhibitors. 

In one aspect of the invention, the disclosed matter provides novel nucleic acid sequences 
useful for [...] gene and the gene product as well as methods, assays and kits directed to 
prognosing the survival of a patient diagnosed with breast cell proliferative disease. Preferably 
said patient is 
treated with adjuvant endocrine monotherapy. 

In one embodiment the method discloses the use of the gene PITX2 as a marker for the prog- 
nosis of the survival of a patient suffering from a breast cell proliferative disease. Preferably 
said patient is treated with adjuvant endocrine monotherapy. Said use of the gene may be en- 
abled by means of any analysis of the expression of the gene, by means of mRNA expression 
analysis or protein expression analysis or by analysis of its genetic modifications leading to an 
altered expression. However, in the most preferred embodiment of the invention, prediction of 
the survival of a patient diagnosed with breast cell proliferative disease, preferably treated 
with adjuvant endocrine monotherapy, is enabled by means of analysis of the methylation 
status of CpG sites within the gene PITX2 and its promoter or regulatory elements. 

To detect the levels of mRNA encoding PITX2 in a detection system for breast cancer re- 
lapse, a sample is obtained from a patient. Said obtaining of a sample is not meant to be re- 
trieving of a sample, as in performing a biopsy, but rather directed to the availability of an 
isolated biological material representing a specific tissue, relevant for the intended use. The 
sample can be a tumour tissue sample from the surgically removed tumour, a biopsy sample 
as taken by a surgeon and provided to the analyst or a sample of blood, plasma, serum or the 
like. The sample may be treated to extract the nucleic acids contained therein. The resulting 
nucleic acid from the sample is subjected to gel electrophoresis or other separation tech- 
niques. Detection involves contacting the nucleic acids and in particular the mRNA of the 
sample with a DNA sequence serving as a probe to form hybrid duplexes. The stringency of 
hybridisation is determined by a number of factors during hybridisation and during the wash- 
ing procedure, including temperature, ionic strength, length of time and concentration of 
formamide. These factors are outlined in, for example, Sambrook et al. (Molecular Cloning: 
A Laboratory Manual, 2nd ed., 1989). Detection of the resulting duplex is usually accom- 
plished by the use of labelled probes. Alternatively, the probe may be unlabeled, but may be 
detectable by specific binding with a ligand which is labelled, either directly or indirectly. 
Suitable labels and methods for labelling probes and ligands are known in the art, and include, 
for example, radioactive labels which may be incorporated by known methods (e.g., nick 
translation or kinasing), biotin, fluorescent groups, chemiluminescent groups (e.g., dioxetanes, 
particularly triggered dioxetanes), enzymes, antibodies, and the like. 

In order to increase the sensitivity of the detection in a sample of mRNA encoding PITX2, the 
technique of reverse transcription/polymerase chain reaction can be used to amplify cDNA 
transcribed from mRNA encoding PITX2. The method of reverse transcription/PCR is well 
known in the art (for example, see Watson and Fleming, supra). 

The reverse transcription/PCR method can be performed as follows. Total cellular RNA is 
isolated by, for example, the standard guanidinium isothiocyanate method and the total RNA is 
reverse transcribed. The reverse transcription method involves synthesis of DNA on a template 
of RNA using a reverse transcriptase enzyme and a 3' end primer. Typically, the primer 
contains an oligo(dT) sequence. The cDNA thus produced is then amplified using the PCR 
method and PITX2 specific primers (Belyavsky et al., Nucl Acid Res 17:2919-2932, 1989; 
Krug and Berger, Methods in Enzymology, Academic Press, N.Y., Vol. 152, pp. 316-325, 
1987, which are incorporated by reference). 

The present invention may also be described in certain embodiments as a kit for use in pre- 
dicting the survival of a breast cancer patient before or after surgical tumour removal with or 
without adjuvant endocrine monotherapy state through testing of a biological sample. A rep- 
resentative kit may comprise one or more nucleic acid segments as described above that se- 
lectively hybridise to PITX2 mRNA and a container for each of the one or more nucleic acid 
segments. In certain embodiments the nucleic acid segments may be combined in a single 
tube. In further embodiments, the nucleic acid segments may also include a pair of primers for 
amplifying the target mRNA. Such kits may also include any buffers, solutions, solvents, enzymes, 
nucleotides, or other components for hybridisation, amplification or detection reactions. 
Preferred kit components include reagents for reverse transcription-PCR, in situ hybridisation, 
Northern analysis and/or RPA. 

The present invention further provides for methods to detect the presence of the polypeptide, 
PITX2, in a sample obtained from a patient. It is preferred that said sequence is essentially the 
same as the sequence presented in SEQ ID NO: 20, as given in Figure 10. Any method known in 
the art for detecting proteins can be used. Such methods include, but are not limited to, immunodiffusion, 
immunoelectrophoresis, immunochemical methods, binder-ligand assays, immunohistochemical 
techniques, agglutination and complement assays (for example see Basic 
and Clinical Immunology, [...] 262, 1991, which is incorporated by reference). Preferred are 
binder-ligand immunoassay 
methods including reacting antibodies with an epitope or epitopes of PITX2 and competi- 
tively displacing a labelled PITX2 protein or derivative thereof. 

Certain embodiments of the present invention comprise the use of antibodies specific to the 
polypeptide encoded by the PITX2 gene. Such antibodies may be useful for prognosing the 
survival of a breast cancer patient, preferably under adjuvant endocrine monotherapy, by comparing 
a patient's levels of PITX2 marker expression to expression of the same marker in 
normal individuals. In certain embodiments production of monoclonal or polyclonal antibodies 
can be induced by the use of the PITX2 polypeptide as antigen. Such antibodies may in 
turn be used to detect expressed proteins as markers for prognosis of relapse of a breast cancer 
patient under adjuvant endocrine monotherapy. The levels of such proteins present in the pe- 
ripheral blood of a patient may be quantified by conventional methods. Antibody-protein 
binding may be detected and quantified by a variety of means known in the art, such as label- 
ling with fluorescent or radioactive ligands. The invention further comprises kits for per- 
forming the above-mentioned procedures, wherein such kits contain antibodies specific for 
the PITX2 polypeptides. 

Numerous competitive and non-competitive protein binding immunoassays are well known in 
the art. Antibodies employed in such assays may be unlabeled, for example as used in agglutination 
tests, or labelled for use in a wide variety of assay methods. Labels that can be used include 
radionuclides, enzymes, fluorescers, chemiluminescers, enzyme substrates or co-factors, 
enzyme inhibitors, particles, dyes and the like for use in radioimmunoassay (RIA), enzyme 
immunoassays, e.g., enzyme-linked immunosorbent assay (ELISA), fluorescent immunoas- 
says and the like. Polyclonal or monoclonal antibodies to PITX2 or an epitope thereof can be 
made for use in immunoassays by any of a number of methods known in the art. One ap- 
proach for preparing antibodies to a protein is the selection and preparation of an amino acid 
sequence of all or part of the protein, chemically synthesising the sequence and injecting it 
into an appropriate animal, usually a rabbit or a mouse (Milstein and Kohler, Nature 256:495-497, 
1975; Gulfre and Milstein, Methods in Enzymology: Immunochemical Techniques 73:1-46, 
Langone and Banatis eds., Academic Press, 1981, which are incorporated by reference). 
Methods for preparation of PITX2 or an epitope thereof include, but are not limited to, chemical 
synthesis, recombinant DNA techniques or isolation from biological samples. 

The invention provides significant improvements over the state of the art in that, at the time 
of filing, there are no single markers known to the public which can be used to predict the 
likelihood of relapse or of survival of a breast cancer patient under adjuvant endocrine 
monotherapy, neither from tissue samples nor from body fluid samples. 

Also, no methylation marker is known which can be used to detect the likelihood of relapse or 
of survival of a breast cancer patient. Especially, no methylation marker is known which can 
be used to detect the likelihood of relapse or of survival of a breast cancer patient under adju- 
vant endocrine monotherapy, neither from tissue samples nor from body fluid samples. 

The objective of the invention is also preferably achieved by analysis of the methylation state 
of the CpG dinucleotides within the genomic sequence according to SEQ ID NO: 1 and se- 
quences complementary thereto. SEQ ID NO: 1 discloses the gene PITX2 and its promoter 
and regulatory elements, wherein said fragment comprises CpG dinucleotides exhibiting a 
disease specific methylation pattern. The methylation pattern of the gene PITX2 and its promoter 
and regulatory elements has heretofore not been analysed with regard to prognosis or 
prediction of survival of a patient diagnosed with a breast cell proliferative disorder. Due to 
the degeneracy of the genetic code, the sequence as identified in SEQ ID NO: 1 should be 
interpreted so as to include all substantially similar and equivalent sequences upstream of the 
promoter region of a gene which encodes a polypeptide with the biological activity of that 
encoded by PITX2. 

In a preferred embodiment of the method, the objective of the invention is achieved by analysis 
of a nucleic acid comprising a sequence of at least 18 bases in length according to one of 
SEQ ID NO: 2 to SEQ ID NO: 5 and sequences complementary thereto. 

The sequences of SEQ ID NOS: 2 to 5 provide modified versions of the nucleic acid according 
to SEQ ID NO: 1, wherein the conversion of said sequence results in the synthesis of a 
nucleic acid having a sequence that is unique and distinct from SEQ ID NO: 1 as follows (see 
also the following TABLE 1): SEQ ID NO: 1, sense DNA strand of PITX2 gene and its promoter 
and regulatory elements; SEQ ID NO: 2, converted SEQ ID NO: 1, wherein "C" is 
converted to "T," but "CpG" remains "CpG" (i.e., corresponds to case where, for SEQ ID NO: 1, 
all "C" residues of CpG dinucleotide sequences are methylated and are thus not converted); 
SEQ ID NO: 3, complement of SEQ ID NO: 1, wherein "C" is converted to "T," but "CpG" 
remains "CpG" (i.e., corresponds to case where, for the complement (antisense strand) of 
SEQ ID NO: 1, all "C" residues of CpG dinucleotide sequences are methylated and are thus 
not converted); SEQ ID NO: 4, converted SEQ ID NO: 1, wherein "C" is converted to "T" for 
all "C" residues, including those of "CpG" dinucleotide sequences (i.e., corresponds to case 
where, for SEQ ID NO: 1, all "C" residues of CpG dinucleotide sequences are unmethylated); 
SEQ ID NO: 5, complement of SEQ ID NO: 1, wherein "C" is converted to "T" for all "C" residues, 
including those of "CpG" dinucleotide sequences (i.e., corresponds to case where, for 
the complement (antisense strand) of SEQ ID NO: 1, all "C" residues of CpG dinucleotide 
sequences are unmethylated). 



TABLE 1. Description of SEQ ID NOS: 1 to 5 

SEQ ID NO | Relationship to SEQ ID NO: 1 | Nature of cytosine base conversion 
SEQ ID NO: 1 | Sense strand (PITX2 gene including promoter and regulatory elements) | None; untreated sequence 
SEQ ID NO: 2 | Converted sense strand | "C" to "T," but "CpG" remains "CpG" (all "C" residues of CpGs are methylated) 
SEQ ID NO: 3 | Converted antisense strand | "C" to "T," but "CpG" remains "CpG" (all "C" residues of CpGs are methylated) 
SEQ ID NO: 4 | Converted sense strand | "C" to "T" for all "C" residues (all "C" residues of CpGs are unmethylated) 
SEQ ID NO: 5 | Converted antisense strand | "C" to "T" for all "C" residues (all "C" residues of CpGs are unmethylated) 
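
The four conversions summarised in Table 1 can be sketched in silico as follows. This is an illustrative sketch only: the helper names are hypothetical, the input is a toy sequence rather than SEQ ID NO: 1, and it is an assumption that the antisense strand is handled as the reverse complement written 5' to 3'.

```python
# Illustrative in-silico cytosine conversion (hypothetical helper names).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary (antisense) strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def convert_cytosines(seq, cpg_methylated):
    """Convert C to T; if cpg_methylated, a C directly followed by G is kept."""
    seq = seq.upper()
    converted = []
    for i, base in enumerate(seq):
        in_cpg = base == "C" and i + 1 < len(seq) and seq[i + 1] == "G"
        if base == "C" and not (cpg_methylated and in_cpg):
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def converted_variants(sense):
    """Return four converted sequences analogous to SEQ ID NOS: 2 to 5."""
    antisense = reverse_complement(sense)
    return {
        "sense, CpG methylated": convert_cytosines(sense, True),
        "antisense, CpG methylated": convert_cytosines(antisense, True),
        "sense, CpG unmethylated": convert_cytosines(sense, False),
        "antisense, CpG unmethylated": convert_cytosines(antisense, False),
    }


variants = converted_variants("ACGTCCGA")  # toy sequence, not SEQ ID NO: 1
```

Because the two strands are no longer complementary after conversion, the sense and antisense variants have to be generated separately, which is consistent with the four distinct sequences listed in the table.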



Significantly, heretofore, the nucleic acid sequences and molecules according to SEQ ID NO: 
1 to SEQ ID NO: 5 were not implicated in or connected with the ascertainment of the progno- 
sis of breast cancer relapse or the prediction of survival of breast cancer patients. 



The described invention further discloses oligonucleotides or oligomers for detecting the cy- 
tosine methylation state within pretreated DNA, according to SEQ ID NO: 2 to SEQ ID NO: 
5. The use of said oligonucleotides or oligomers comprising a nucleic acid sequence having a 
length of at least nine (9) nucleotides which hybridise, under moderately stringent or stringent 
conditions (as defined herein above), to a pretreated nucleic acid sequence according to SEQ 
ID NO: 2 to SEQ ID NO: 5 and/or sequences complementary thereto is another embodiment 
of this invention. 

Thus, the present invention includes the use of nucleic acid molecules (e.g., oligonucleotides 
and peptide nucleic acid (PNA) molecules (PNA-oligomers)) that hybridise under moderately 
stringent and/or stringent hybridisation conditions to all or a portion of the sequences of SEQ 
ID NOS: 2 to 5, or to the complements thereof for prediction of survival according to the in- 
vention. The hybridising portion of the hybridising nucleic acids is typically at least 9, 15, 20, 
25, 30 or 35 nucleotides in length. However, longer molecules have inventive utility, and are 
thus within the scope of the present invention. 

Preferably, the hybridising portion of the inventive hybridising nucleic acids is at least 95%, 
or at least 98%, or 100% identical to the sequence, or to a portion thereof of SEQ ID NOS: 2 
to 5, or to the complements thereof. 

Hybridising nucleic acids of the type described herein can be used, for example, as a primer 
(e.g., a PCR primer), or a diagnostic and/or prognostic probe or primer. Preferably, hybridisation 
of the oligonucleotide probe to a nucleic acid sample is performed under stringent conditions 
and the probe is 100% identical to the target sequence. Nucleic acid duplex or hybrid 
stability is expressed as the melting temperature or Tm, which is the temperature at which a 
probe dissociates from a target DNA. This melting temperature is used to define the required 
stringency conditions. 

For target sequences that are related and substantially identical to the corresponding sequence 
of SEQ ID NO: 1 (such as PITX2 allelic variants and SNPs), rather than identical, it is useful 
to first establish the lowest temperature at which only homologous hybridisation occurs with a 
particular concentration of salt (e.g., SSC or SSPE). Then, assuming that 1% mismatching 
results in a 1°C decrease in the Tm, the temperature of the final wash in the hybridisation reaction 
is reduced accordingly (for example, if sequences having > 95% identity with the probe 
are sought, the final wash temperature is decreased by 5°C). In practice, the change in Tm can 
be between 0.5°C and 1.5°C per 1% mismatch. 
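
The wash-temperature adjustment described above amounts to simple arithmetic, sketched below. The function name and the 65°C starting temperature in the usage line are hypothetical; only the 1°C per 1% mismatch rule and the 0.5°C to 1.5°C range come from the text.

```python
def final_wash_temperature(homologous_temp_c, min_identity_pct, tm_drop_per_pct=1.0):
    """Lower the final wash temperature to admit targets of a given identity.

    homologous_temp_c: lowest temperature (deg C) at which only homologous
        hybridisation occurs at the chosen salt concentration.
    min_identity_pct: minimum percent identity sought with the probe.
    tm_drop_per_pct: assumed Tm decrease (deg C) per 1% mismatch.
    """
    if not 0.0 <= min_identity_pct <= 100.0:
        raise ValueError("min_identity_pct must be between 0 and 100")
    mismatch_pct = 100.0 - min_identity_pct
    return homologous_temp_c - mismatch_pct * tm_drop_per_pct


# Hypothetical starting point of 65 deg C, seeking at least 95% identity.
adjusted = final_wash_temperature(65.0, 95.0)
```

Evaluating the function at both ends of the 0.5°C to 1.5°C range gives a bracket for the wash temperature when the exact mismatch penalty is unknown.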

Examples of inventive oligonucleotides of length X (in nucleotides), as indicated by polynucleotide 
positions with reference to, e.g., SEQ ID NO: 1, include those corresponding to sets 
of consecutively overlapping oligonucleotides of length X, where the oligonucleotides within 
each consecutively overlapping set (corresponding to a given X value) are defined as the finite 
set of Z oligonucleotides from nucleotide positions: 

n to (n + (X-1)); 

where n=1, 2, 3,...(Y-(X-1)); 

where Y equals the length (nucleotides or base pairs) of SEQ ID NO: 1; 
where X equals the common length (in nucleotides) of each oligonucleotide in 
the set (e.g., X=20 for a set of consecutively overlapping 20-mers); and 
where the number (Z) of consecutively overlapping oligomers of length X for a 
given SEQ ID NO of length Y is equal to Y-(X-1). For example Z=2,785-19=2,766 
for either sense or antisense sets of SEQ ID NO: 1, where X=20. 
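
The definition of a consecutively overlapping set can be sketched as follows. The function names are hypothetical, and the placeholder sequence merely has the stated length Y=2,785; it is not SEQ ID NO: 1. The CpG filter is shown for an unconverted sequence only.

```python
def overlapping_oligos(seq, x):
    """Return all oligomers of length x at positions n to n + (x - 1).

    n runs from 1 to Y - (x - 1), where Y = len(seq), so the set
    contains Z = Y - (x - 1) oligomers.
    """
    y = len(seq)
    if x < 1 or x > y:
        raise ValueError("x must be between 1 and the sequence length")
    return [seq[n - 1:n - 1 + x] for n in range(1, y - (x - 1) + 1)]


def containing_cpg(oligos):
    """Keep only oligomers that contain at least one CpG dinucleotide."""
    return [oligo for oligo in oligos if "CG" in oligo.upper()]


# Placeholder sequence of length Y = 2785 (not the real SEQ ID NO: 1).
placeholder = ("ACGT" * 697)[:2785]
twenty_mers = overlapping_oligos(placeholder, 20)
small_set = overlapping_oligos("ACGTA", 3)
```

With Y=2,785 and X=20 the list holds 2,766 oligomers, matching the worked value of Z given in the text.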

Preferably, the set is limited to those oligomers that comprise at least one CpG, Cpa or tpG 
dinucleotide, wherein 'Cpa' is indicating that said Cpa hybridises to a position (tpG) which 
was a CpG prior to bisulfite conversion and is a TpG now; and wherein 'tpG' is indicating 
that said tpG hybridises to a position (Cpa) which is the complementary to a position (tpG) 
which was a CpG prior to bisulfite conversion and is a TpG now. 

The present invention encompasses, for each of SEQ ID NOS: 2 to 5 (sense and antisense), 
the use of multiple consecutively overlapping sets of oligonucleotides or modified oligonu- 
cleotides of length X, where, e.g., X= 9, 10, 17, 20, 22, 23, 25, 27, 30 or 35 nucleotides. 

The oligonucleotides or oligomers according to the present invention constitute effective tools 
useful to ascertain genetic and epigenetic parameters of the genomic sequence corresponding 
to SEQ ID NO: 1. Preferred sets of such oligonucleotides or modified oligonucleotides of 
length X are those consecutively overlapping sets of oligomers corresponding to SEQ ID 
NOS: 1-5 (and to the complements thereof). Preferably, said oligomers comprise at least one 
CpG, tpG or Cpa dinucleotide. 

Particularly preferred oligonucleotides or oligomers used in the present invention are those in
which the cytosine of the CpG dinucleotide (or of the corresponding converted TpG or CpA
dinucleotide) sequences is within the middle third of the oligonucleotide; that is, where the
oligonucleotide is, for example, 13 bases in length, the CpG, TpG or CpA dinucleotide is
positioned within the fifth to ninth nucleotide from the 5'-end.

The oligonucleotides used in this invention can also be modified by chemically linking the
oligonucleotide to one or more moieties or conjugates to enhance the activity, stability or
detection of the oligonucleotide. Such moieties or conjugates include chromophores,
fluorophors, lipids such as cholesterol, cholic acid, thioether, aliphatic chains, phospholipids,
polyamines, polyethylene glycol (PEG), palmityl moieties, and others as disclosed in, for
example, United States Patent Numbers 5,514,758, 5,565,552, 5,567,810, 5,574,142, 5,585,481,
5,587,371, 5,597,696 and 5,958,773. The probes may also exist in the form of a PNA (peptide
nucleic acid) which has particularly preferred pairing properties. Thus, the oligonucleotide
may include other appended groups such as peptides, and may include hybridisation-triggered
cleavage agents (Krol et al., BioTechniques 6:958-976, 1988) or intercalating agents (Zon,
Pharm. Res. 5:539-549, 1988). To this end, the oligonucleotide may be conjugated to another
molecule, e.g., a chromophore, fluorophor, peptide, hybridisation-triggered cross-linking
agent, transport agent, hybridisation-triggered cleavage agent, etc.

The oligonucleotide may also comprise at least one art-recognised modified sugar and/or base
moiety, or may comprise a modified backbone or non-natural internucleoside linkage.

The oligomers used in the present invention are normally used in so called "sets" which
contain at least one oligomer for analysis of each of the CpG dinucleotides of a genomic
sequence comprising SEQ ID NO: 1 and sequences complementary thereto or to their
corresponding CG, tG or Ca dinucleotide within the pretreated nucleic acids according to SEQ ID
NO: 2 to SEQ ID NO: 5 and sequences complementary thereto, wherein a 't' indicates a
nucleotide which converted from a cytosine into a thymine and wherein 'a' indicates the
complementary nucleotide to such a converted thymine. Preferred is a set which contains at least
one oligomer for each of the CpG dinucleotides within the gene PITX2 and its promoter and
regulatory elements in both the pretreated and genomic versions of said gene, SEQ ID NO: 2
to 5 and SEQ ID NO: 1, respectively. However, it is anticipated that for economic or other
factors it may be preferable to analyse a limited selection of the CpG dinucleotides within
said sequences and the contents of the set of oligonucleotides should be altered accordingly.
Therefore, the present invention moreover relates to a set of at least 3 n (oligonucleotides
and/or PNA-oligomers) used for detecting the cytosine methylation state in pretreated
genomic DNA (SEQ ID NO: 2 to SEQ ID NO: 5 and sequences complementary thereto) and
genomic DNA (SEQ ID NO: 1 and sequences complementary thereto). These probes enable
diagnosis and/or therapy of genetic and epigenetic parameters of cell proliferative disorders.
The set of oligomers may also be used for detecting single nucleotide polymorphisms (SNPs)
in pretreated genomic DNA (SEQ ID NO: 2 to SEQ ID NO: 5, and sequences complementary
thereto) and genomic DNA (SEQ ID NO: 1, and sequences complementary thereto).

Moreover, the present invention includes the use of a set of at least two oligonucleotides 
which can be used as so-called "primer oligonucleotides" for amplifying DNA sequences of 
one of SEQ ID NO: 1 to SEQ ID NO: 5 and sequences complementary thereto, or segments 
thereof. 

In the case of the sets of oligonucleotides according to the present invention, it is preferred 
that at least one and more preferably all members of the set of oligonucleotides is bound to a 
solid phase. 

According to the present invention, it is preferred that an arrangement of different
oligonucleotides and/or PNA-oligomers (a so-called "array") made available by the present invention
is present in a manner that it is likewise bound to a solid phase. This array of different
oligonucleotide- and/or PNA-oligomer sequences can be characterised in that it is arranged on the
solid phase in the form of a rectangular or hexagonal lattice. The solid phase surface is
preferably composed of silicon, glass, polystyrene, aluminium, steel, iron, copper, nickel, silver,
or gold. However, nitrocellulose as well as plastics such as nylon which can exist in the form
of pellets or also as resin matrices may also be used.

Therefore, a further subject matter of the present invention is a method for manufacturing an 
array fixed to a carrier material for analysis in connection with cell proliferative disorders, in 
which method at least one oligomer according to the present invention is coupled to a solid 
phase. Methods for manufacturing such arrays are known, for example, from US Patent 
5,744,305 by means of solid-phase chemistry and photolabile protecting groups. 

A further subject matter of the present invention relates to a DNA chip for the analysis of cell 
proliferative disorders. DNA chips are known, for example, in US Patent 5,837,832. 

The described invention further provides a composition of matter useful for prognosing the
relapse of breast cancer patients or predicting the survival of breast cancer patients. Said
composition comprising at least one nucleic acid 18 base pairs in length of a segment of the
nucleic acid sequence disclosed in SEQ ID NO: 2 to 5, and one or more substances taken from
the group comprising:

1-5 mM magnesium chloride, 100-500 µM dNTP, 0.5-5 units/10 µl of Taq polymerase, bovine
serum albumin, an oligomer in particular an oligonucleotide or peptide nucleic acid (PNA)-
oligomer, said oligomer comprising in each case at least one base sequence having a length of
at least 9 nucleotides which is complementary to, or hybridises under moderately stringent or
stringent conditions to a pretreated genomic DNA according to one of the SEQ ID NO: 2 to
SEQ ID NO: 5 and sequences complementary thereto. It is preferred that said composition of
matter comprises a buffer solution appropriate for the stabilisation of said nucleic acid in an
aqueous solution and enabling polymerase based reactions within said solution. Suitable
buffers are known in the art and commercially available.

The present invention further provides a method for conducting an assay in order to ascertain
genetic and/or epigenetic parameters of the gene PITX2 and its promoter and regulatory
elements. Most preferably the assay according to the following method is used in order to detect
methylation within the gene PITX2 wherein said methylated nucleic acids are present in a
solution further comprising an excess of background DNA, wherein the background DNA is
present in between 100 to 1000 times the concentration of the DNA to be detected. Said
method comprising contacting a nucleic acid sample obtained from said subject with at least
one reagent or a series of reagents, wherein said reagent or series of reagents distinguishes
between methylated and non-methylated CpG dinucleotides within the target nucleic acid.

Preferably, said method comprises the following steps: In the first step, a sample of the tissue
to be analysed is obtained. The source may be any suitable source; preferably, the source of
the sample is selected from the group consisting of histological slides, biopsies, paraffin-
embedded tissue, bodily fluids, plasma, serum, stool, urine, blood, nipple aspirate and
combinations thereof. Preferably, the source is tumour tissue, biopsies, serum, urine, blood or nipple
aspirate. The most preferred source is the tumour sample, surgically removed from the
patient, or a biopsy sample of said patient.

The DNA is then isolated from the sample. Extraction may be by means that are standard to
one skilled in the art, including the use of detergent lysates, sonication and vortexing with
glass beads. Once the nucleic acids have been extracted, the genomic double-stranded DNA is
used in the analysis.

In the second step of the method, the genomic DNA sample is treated in such a manner that
cytosine bases which are unmethylated at the 5'-position are converted to uracil, thymine, or
another base which is dissimilar to cytosine in terms of hybridisation behaviour. This will be
understood as 'pretreatment' herein.

The above described pretreatment of genomic DNA is preferably carried out with bisulfite
(hydrogen sulfite, disulfite) and subsequent alkaline hydrolysis which results in a conversion
of non-methylated cytosine nucleobases to uracil or to another base which is dissimilar to
cytosine in terms of base pairing behaviour. Enclosing the DNA to be analysed in an agarose
matrix, thereby preventing the diffusion and renaturation of the DNA (bisulfite only reacts
with single-stranded DNA), and replacing all precipitation and purification steps with fast
dialysis (Olek A, et al., A modified and improved method for bisulfite based cytosine
methylation analysis. Nucleic Acids Res. 24:5064-6, 1996) is one preferred example of how to
perform said pretreatment. It is further preferred that the bisulfite treatment is carried out in the
presence of a radical scavenger or DNA denaturing agent.
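
The effect of the pretreatment on a sequence can be modelled in silico. The following Python sketch is an illustration only, under stated assumptions: it models a single strand, represents converted cytosines as 'T' (the chemistry yields uracil, which is read as thymine after amplification), and takes the methylated cytosine positions as a hypothetical input set of 0-based indices. The function name and toy sequence are not part of the disclosure.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Model bisulfite pretreatment of one DNA strand.

    Unmethylated cytosines are written as 'T' (uracil, read as thymine
    after amplification); cytosines whose 0-based index is listed in
    methylated_positions are left unchanged.
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Toy strand (hypothetical): cytosines at indices 1, 4 and 5.
toy = "ACGTCCG"
print(bisulfite_convert(toy))        # fully unmethylated strand
print(bisulfite_convert(toy, {1}))   # cytosine at index 1 methylated
```

Comparing the two outputs shows how a methylated CpG remains 'CG' while an unmethylated one reads as 'TG', which is the difference the later detection steps rely on.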

In the third step of the method, fragments of the pretreated DNA are amplified. Where the
source of the DNA is free DNA from serum, or DNA extracted from paraffin, it is particularly
preferred that the size of the amplificate fragment is between 100 and 200 base pairs in length,
and where said DNA source is extracted from cellular sources (e.g. tissues, biopsies, cell
lines) it is preferred that the amplificate is between 100 and 350 base pairs in length. It is
particularly preferred that said amplificates comprise at least one 20 base pair sequence
comprising at least three CpG dinucleotides. Said amplification is carried out using sets of primer
oligonucleotides according to the present invention, and a preferably heat-stable polymerase.
The amplification of several DNA segments can be carried out simultaneously in one and the
same reaction vessel; in one embodiment of the method preferably six or more fragments are
amplified simultaneously. Typically, the amplification is carried out using a polymerase chain
reaction (PCR). The set of primer oligonucleotides includes at least two oligonucleotides
whose sequences are each reverse complementary, identical, or hybridise under stringent or
highly stringent conditions to an at least 18-base-pair long segment of the base sequences of
SEQ ID NO: 2 to SEQ ID NO: 5 and sequences complementary thereto.

In one especially preferred embodiment of the method the primers may be selected from the
group consisting of SEQ ID NO: 6 to SEQ ID NO: 10.
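
The amplificate preferences stated for the third step (a size range that depends on the DNA source, and at least one 20 base pair stretch comprising at least three CpG dinucleotides) can be checked with a small Python sketch. The source labels, function names and test sequences are hypothetical, and the sketch assumes the CpG dinucleotides are counted on the sequence as supplied.

```python
# Preferred amplificate size ranges (base pairs) taken from the text, keyed
# by hypothetical source labels chosen for this sketch.
SIZE_RANGES = {
    "serum_or_paraffin": (100, 200),
    "cellular": (100, 350),
}


def amplicon_length_ok(length, source):
    """Check an amplificate length against the preferred range for its source."""
    low, high = SIZE_RANGES[source]
    return low <= length <= high


def has_cpg_dense_window(seq, window=20, min_cpg=3):
    """Return True if any window of the given size holds at least min_cpg 'CG'."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1):
        if seq[start:start + window].count("CG") >= min_cpg:
            return True
    return False


# Hypothetical candidate with three adjacent CpG dinucleotides in the middle.
candidate = "AT" * 50 + "CGCGCG" + "AT" * 50
print(amplicon_length_ok(len(candidate), "serum_or_paraffin"))
print(has_cpg_dense_window(candidate))
```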

In an alternate embodiment of the method, the methylation status of preselected CpG
positions within the nucleic acid sequences comprising SEQ ID NO: 2 to SEQ ID NO: 5 may be
detected by use of methylation-specific primer oligonucleotides. This technique (MSP) has
been described in United States Patent No. 6,265,171 to Herman. The use of methylation
status specific primers for the amplification of bisulfite treated DNA allows the differentiation
between methylated and unmethylated nucleic acids. MSP primer pairs contain at least one
primer which hybridises to a bisulfite treated CpG dinucleotide. Therefore, the sequence of
said primers comprises at least one CpG, TpG or CpA dinucleotide. MSP primers specific for
non-methylated DNA contain a 'T' at the 3' position of the C position in the CpG. Preferably,
therefore, the base sequence of said primers is required to comprise a sequence having a
length of at least 18 nucleotides which hybridises to a pretreated nucleic acid sequence
according to SEQ ID NO: 2 to SEQ ID NO: 5 and sequences complementary thereto, wherein
the base sequence of said oligomers comprises at least one CpG, tpG or Cpa dinucleotide. In
this embodiment of the method according to the invention it is particularly preferred that the
MSP primers comprise between 2 and 4 CpG, tpG or Cpa dinucleotides. It is further
preferred that said dinucleotides are located within the 3' half of the primer, e.g. where a primer
is 18 bases in length the specified dinucleotides are located within the first 9 bases from the
3'-end of the molecule. In addition to the CpG, tpG or Cpa dinucleotides it is further preferred
that said primers should further comprise several bisulfite converted bases (i.e. cytosine
converted to thymine, or on the hybridising strand, guanine converted to adenosine). In a further
preferred embodiment said primers are designed so as to comprise no more than 2 cytosine or
guanine bases.
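
Some of the MSP primer preferences above lend themselves to a simple check. The Python sketch below is a simplified illustration, not a primer design tool: it considers only CpG dinucleotides (as in a primer specific for the methylated state), requires a length of at least 18 nucleotides, between 2 and 4 CpG dinucleotides, and every CpG lying wholly within the 3' half of the primer. The function name and primer sequences are hypothetical.

```python
def msp_primer_ok(primer, min_len=18, min_cpg=2, max_cpg=4):
    """Check a candidate methylation-specific primer (written 5' to 3').

    Criteria modelled: minimum length, 2 to 4 CpG dinucleotides, and all
    CpG dinucleotides located within the 3' half of the primer.
    """
    primer = primer.upper()
    if len(primer) < min_len:
        return False
    cpg_starts = [i for i in range(len(primer) - 1) if primer[i:i + 2] == "CG"]
    if not (min_cpg <= len(cpg_starts) <= max_cpg):
        return False
    # For an 18-mer the 3' half begins at 0-based index 9 (the last 9 bases).
    three_prime_half_start = len(primer) - len(primer) // 2
    return all(start >= three_prime_half_start for start in cpg_starts)


# Hypothetical 18-mers: the first has both CpGs in the 3' half,
# the second has one CpG at the 5'-end.
print(msp_primer_ok("ATTTTAGTTATTCGTTCG"))
print(msp_primer_ok("CGTTTAGTTATTCGTTTG"))
```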

The fragments obtained by means of the amplification can carry a directly or indirectly
detectable label. Preferred are labels in the form of fluorescence labels, radionuclides, or
detachable molecule fragments having a typical mass which can be detected in a mass spectrometer.
Where said labels are mass labels, it is preferred that the labelled amplificates have a single
positive or negative net charge, allowing for better detectability in the mass spectrometer. The
detection may be carried out and visualised by means of, e.g., matrix assisted laser
desorption/ionisation mass spectrometry (MALDI) or using electron spray mass spectrometry (ESI).

Matrix Assisted Laser Desorption/Ionization Mass Spectrometry (MALDI-TOF) is a very
efficient development for the analysis of biomolecules (Karas & Hillenkamp, Anal Chem.,
60:2299-301, 1988). An analyte is embedded in a light-absorbing matrix. The matrix is
evaporated by a short laser pulse thus transporting the analyte molecule into the vapour phase
in an unfragmented manner. The analyte is ionised by collisions with matrix molecules. An
applied voltage accelerates the ions into a field-free flight tube. Due to their different masses,
the ions are accelerated at different rates. Smaller ions reach the detector sooner than bigger
ones. MALDI-TOF spectrometry is well suited to the analysis of peptides and proteins. The
analysis of nucleic acids is somewhat more difficult (Gut & Beck, Current Innovations and
Future Trends, 1:147-57, 1995). The sensitivity with respect to nucleic acid analysis is
approximately 100-times less than for peptides, and decreases disproportionally with increasing
fragment size. Moreover, for nucleic acids having a multiply negatively charged backbone,
the ionisation process via the matrix is considerably less efficient. In MALDI-TOF
spectrometry, the selection of the matrix plays an eminently important role. For the desorption of
peptides, several very efficient matrixes have been found which produce a very fine
crystallisation. There are now several responsive matrixes for DNA; however, the difference in
sensitivity between peptides and nucleic acids has not been reduced. This difference in sensitivity
can be reduced, however, by chemically modifying the DNA in such a manner that it becomes
more similar to a peptide. For example, phosphorothioate nucleic acids, in which the usual
phosphates of the backbone are substituted with thiophosphates, can be converted into a
charge-neutral DNA using simple alkylation chemistry (Gut & Beck, Nucleic Acids Res. 23:
1367-73, 1995). The coupling of a charge tag to this modified DNA results in an increase in
MALDI-TOF sensitivity to the same level as that found for peptides. A further advantage of
charge tagging is the increased stability of the analysis against impurities, which makes the
detection of unmodified substrates considerably more difficult.

In a particularly preferred embodiment of the method the amplification of step three is carried
out in the presence of at least one species of blocker oligonucleotides. The use of such blocker
oligonucleotides has been described by Yu et al., BioTechniques 23:714-720, 1997. The use
of blocking oligonucleotides enables the improved specificity of the amplification of a
subpopulation of nucleic acids. Blocking probes hybridised to a nucleic acid suppress, or hinder
the polymerase mediated amplification of said nucleic acid. In one embodiment of the method
blocking oligonucleotides are designed so as to hybridise to background DNA. In a further
embodiment of the method said oligonucleotides are designed so as to hinder or suppress the
amplification of unmethylated nucleic acids as opposed to methylated nucleic acids or vice
versa.

Blocking probe oligonucleotides are hybridised to the bisulfite treated nucleic acid
concurrently with the PCR primers. PCR amplification of the nucleic acid is terminated at the 5'
position of the blocking probe, such that amplification of a nucleic acid is suppressed where the
complementary sequence to the blocking probe is present. The probes may be designed to
hybridise to the bisulfite treated nucleic acid in a methylation status specific manner. For
example, for detection of methylated nucleic acids within a population of unmethylated nucleic
acids, suppression of the amplification of nucleic acids which are unmethylated at the position
in question would be carried out by the use of blocking probes comprising a 'TpG' at the
position in question, as opposed to a 'CpG.' In one embodiment of the method the sequence of
said blocking oligonucleotides should be identical or complementary to a sequence at least
18 base pairs in length selected from the group
consisting of SEQ ID NOS: 2 to 5, preferably comprising one or more CpG, TpG or CpA
dinucleotides. In one embodiment of the method the sequence of said oligonucleotides is
selected from the group consisting of SEQ ID NO: 15 and SEQ ID NO: 16 and sequences
complementary thereto.

For PCR methods using blocker oligonucleotides, efficient disruption of polymerase-mediated
amplification requires that blocker oligonucleotides not be elongated by the polymerase.
Preferably, this is achieved through the use of blockers that are 3'-deoxyoligonucleotides, or
oligonucleotides derivatised at the 3' position with other than a "free" hydroxyl group. For
example, 3'-O-acetyl oligonucleotides are representative of a preferred class of blocker
molecule.

Additionally, polymerase-mediated decomposition of the blocker oligonucleotides should be
precluded. Preferably, such preclusion comprises either use of a polymerase lacking 5'-3'
exonuclease activity, or use of modified blocker oligonucleotides having, for example, thioate
bridges at the 5'-termini thereof that render the blocker molecule nuclease-resistant. Particular
applications may not require such 5' modifications of the blocker. For example, if the
blocker- and primer-binding sites overlap, thereby precluding binding of the primer (e.g., with
excess blocker), degradation of the blocker oligonucleotide will be substantially precluded.
This is because the polymerase will not extend the primer toward, and through (in the 5'-3'
direction) the blocker - a process that normally results in degradation of the hybridised
blocker oligonucleotide.

A particularly preferred blocker/PCR embodiment, for purposes of the present invention and 
as implemented herein, comprises the use of peptide nucleic acid (PNA) oligomers as block- 
ing oligonucleotides. Such PNA blocker oligomers are ideally suited, because they are neither 
decomposed nor extended by the polymerase. 

In one embodiment of the method, the binding site of the blocking oligonucleotide is identical
to, or overlaps with that of the primer and thereby hinders the hybridisation of the primer to
its binding site. In a further preferred embodiment of the method, two or more such blocking
oligonucleotides are used. In a particularly preferred embodiment, the hybridisation of one of
the blocking oligonucleotides hinders the hybridisation of a forward primer, and the
hybridisation of another of the probe (blocker) oligonucleotides hinders the hybridisation of a reverse
primer that binds to the amplificate product of said forward primer.

In an alternative embodiment of the method, the blocking oligonucleotide hybridises to a lo- 
cation between the reverse and forward primer positions of the treated background DNA, 
thereby hindering the elongation of the primer oligonucleotides. 

It is particularly preferred that the blocking oligonucleotides are present in at least 5 times the 
concentration of the primers. 

In the fourth step of the method, the amplificates obtained during the third step of the method 
are analysed in order to ascertain the methylation status of the CpG dinucleotides prior to the 
treatment. 

In embodiments where the amplificates were obtained by means of MSP amplification and/or
blocking oligonucleotides, the presence or absence of an amplificate is in itself indicative of
the methylation state of the CpG positions covered by the primers and/or blocking
oligonucleotide, according to the base sequences thereof. All possible known molecular biological
methods may be used for this detection, including, but not limited to gel electrophoresis,
sequencing, liquid chromatography, hybridisations, real time PCR analysis or combinations
thereof. This step of the method further acts as a qualitative control of the preceding steps.

In the fourth step of the method amplificates obtained by means of both standard and
methylation specific PCR are further analysed in order to determine the CpG methylation status of
the genomic DNA isolated in the first step of the method. This may be carried out by means
of hybridisation-based methods such as, but not limited to, array technology and probe based
technologies as well as by means of techniques such as sequencing and template directed
extension.

In one embodiment of the method, the amplificates synthesised in step three are subsequently
hybridised to an array or a set of oligonucleotides and/or PNA probes. In this context, the
hybridisation takes place in the following manner: the set of probes used during the
hybridisation is preferably composed of at least 2 oligonucleotides or PNA-oligomers; in the process,
the amplificates serve as probes which hybridise to oligonucleotides previously bonded to a
solid phase; the non-hybridised fragments are subsequently removed; said oligonucleotides
contain at least one base sequence having a length of at least 9 nucleotides which is reverse
complementary or identical to a segment of the base sequences specified in the SEQ ID NO: 2
to SEQ ID NO: 5; and the segment comprises at least one CpG, TpG or CpA dinucleotide.

In a preferred embodiment, said dinucleotide is present in the central third of the oligomer.
For example, where the oligomer comprises one CpG dinucleotide, said dinucleotide is
preferably the fifth to ninth nucleotide from the 5'-end of a 13-mer. One oligonucleotide
exists for the analysis of each CpG dinucleotide within the sequence according to SEQ ID NO:
1, and the equivalent positions within SEQ ID NOS: 2 to 5. Said oligonucleotides may also be
present in the form of peptide nucleic acids. The non-hybridised amplificates are then
removed. The hybridised amplificates are detected. In this context, it is preferred that labels
attached to the amplificates are identifiable at each position of the solid phase at which an
oligonucleotide sequence is located.
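
The positional preference stated above (dinucleotide in the central third of the oligomer; fifth to ninth nucleotide of a 13-mer) can be expressed as a short Python sketch. The rule used for the bounds, floor(length/3)+1 to length-floor(length/3), is an assumption of this sketch chosen because it reproduces the 13-mer example in the text; the function names and probe sequences are hypothetical, and only CpG is checked for simplicity.

```python
def middle_third_bounds(length):
    """Return the 1-based (first, last) positions of the central third.

    Assumed rule: floor(length / 3) + 1 to length - floor(length / 3),
    which gives positions 5 to 9 for a 13-mer.
    """
    third = length // 3
    return third + 1, length - third


def cpg_in_middle_third(oligo):
    """True if the cytosine of some CpG lies within the central third."""
    first, last = middle_third_bounds(len(oligo))
    oligo = oligo.upper()
    for i in range(len(oligo) - 1):
        # i is 0-based; i + 1 is the 1-based position of the cytosine.
        if oligo[i:i + 2] == "CG" and first <= i + 1 <= last:
            return True
    return False


print(middle_third_bounds(13))
print(cpg_in_middle_third("AAAAAACGAAAAA"))  # cytosine at position 7
print(cpg_in_middle_third("CGAAAAAAAAAAA"))  # cytosine at position 1
```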

In yet a further embodiment of the method, the genomic methylation status of the CpG posi- 
tions may be ascertained by means of oligonucleotide probes that are hybridised to the bisul- 
fite treated DNA concurrently with the PCR amplification primers (wherein said primers may 
either be methylation specific or standard). 

A particularly preferred embodiment of this method is the use of fluorescence-based Real
Time Quantitative PCR (Heid et al., Genome Res. 6:986-994, 1996; also see United States
Patent No. 6,331,393). There are two preferred embodiments of utilising this method. One
embodiment, known as the TaqMan™ assay, employs a dual-labelled fluorescent
oligonucleotide probe. The TaqMan™ PCR reaction employs the use of a non-extendible
interrogating oligonucleotide, called a TaqMan™ probe, which is designed to hybridise to a CpG-rich
sequence located between the forward and reverse amplification primers. The TaqMan™
probe further comprises a fluorescent "reporter moiety" and a "quencher moiety" covalently
bound to linker moieties (e.g., phosphoramidites) attached to the nucleotides of the TaqMan™
oligonucleotide. Hybridised probes are displaced and broken down by the polymerase of the
amplification reaction thereby leading to an increase in fluorescence. For analysis of
methylation within nucleic acids subsequent to bisulfite treatment, it is required that the probe be
methylation specific, as described in United States Patent No. 6,331,393 (hereby incorporated
by reference in its entirety), also known as the MethyLight assay. The second preferred
embodiment of this MethyLight technology is the use of dual-probe technology (Lightcycler®),
each probe carrying donor or recipient fluorescent moieties, hybridisation of two probes in
proximity to each other is indicated by an increase or fluorescent amplification primers. Both
these techniques may be adapted in a manner suitable for use with bisulfite treated DNA, and
moreover for methylation analysis within CpG dinucleotides.

Also any combination of these probes or combinations of these probes with other known 
probes may be used. 

In a further preferred embodiment of the method, the fourth step of the method comprises the 
use of template-directed oligonucleotide extension, such as MS-SNuPE as described by Gon- 
zalgo & Jones, Nucleic Acids Res. 25:2529-2531, 1997. In said embodiment it is preferred 
that the methylation specific single nucleotide extension primer (MS-SNuPE primer) is iden- 
tical or complementary to a sequence at least nine but preferably no more than twenty five 
nucleotides in length of one or more of the sequences taken from the group of SEQ ID NO: 2 
to SEQ ID NO: 5. However it is preferred to use fluorescently labelled nucleotides, instead of 
radiolabelled nucleotides. 

In yet a further embodiment of the method, the fourth step of the method comprises
sequencing and subsequent sequence analysis of the amplificate generated in the third step of the
method (Sanger F., et al., Proc Natl Acad Sci USA 74:5463-5467, 1977).

Additional embodiments of the invention provide a method for the analysis of the methylation
status of genomic DNA according to the invention (SEQ ID NO: 1) without the need for
pretreatment.

In the first step of such additional embodiments, the genomic DNA sample is isolated from
tissue or cellular sources. Preferably, such sources include cell lines, histological slides,
biopsy tissue, body fluids, or breast tumour tissue embedded in paraffin. Extraction may be by
means that are standard to one skilled in the art, including but not limited to the use of
detergent lysates, sonication and vortexing with glass beads. Once the nucleic acids have been
extracted, the genomic double-stranded DNA is used in the analysis.

In a preferred embodiment, the DNA may be cleaved prior to the treatment, and this may be 
by any means standard in the state of the art, but preferably with methylation-sensitive re- 
striction endonucleases. 

In the second step, the DNA is then digested with one or more methylation sensitive
restriction enzymes. The digestion is carried out such that hydrolysis of the DNA at the restriction
site is informative of the methylation status of a specific CpG dinucleotide.
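
How a methylation sensitive digest reports on a CpG can be illustrated with a toy model in Python. The sketch assumes the enzyme HpaII, which is not named in the text and is used here only as a well-known example: it recognises CCGG, cuts after the first cytosine, and is blocked when the internal cytosine is methylated. Only one strand is modelled, and the function name, input sequence and methylation set are hypothetical.

```python
def hpaii_digest(seq, methylated=frozenset()):
    """Toy single-strand model of an HpaII digest.

    seq is cut at each CCGG site (between the first C and CGG) unless the
    internal cytosine, given by 0-based index, is listed in methylated.
    Returns the resulting fragments in order.
    """
    seq = seq.upper()
    cut_points = []
    site = seq.find("CCGG")
    while site != -1:
        internal_c = site + 1
        if internal_c not in methylated:
            cut_points.append(internal_c)
        site = seq.find("CCGG", site + 1)
    fragments = []
    previous = 0
    for cut in cut_points:
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return fragments


toy = "AAACCGGTTT"  # one CCGG site; internal cytosine at index 4
print(hpaii_digest(toy))        # unmethylated: the site is cut
print(hpaii_digest(toy, {4}))   # methylated: the site is protected
```

An intact fragment spanning the site therefore indicates methylation at that CpG, which is why a subsequent amplification across the site can serve as the readout.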

In the third step, which is optional but a preferred embodiment, the restriction fragments are 
amplified. This is preferably carried out using a polymerase chain reaction, and said amplifi- 
cates may carry suitable detectable labels as discussed above, namely fluorophore labels, ra- 
dionuclides and mass labels. 

In the final step the amplificates are detected. The detection may be by any means standard in
the art, for example, but not limited to, gel electrophoresis analysis, hybridisation analysis,
incorporation of detectable tags within the PCR products, DNA array analysis, MALDI or
ESI analysis.

The present invention enables prognosis of events which are disadvantageous to patients or
individuals in which important genetic and/or epigenetic parameters within the PITX2 gene
and its promoter or regulatory elements may be used as prognostic markers for breast cancer
relapse or as an 'adjuvant' marker for prediction of [...] of endocrine
monotherapy. Said parameters obtained by means of the present invention may be
compared to another set of genetic and/or epigenetic parameters, the differences serving as the
basis for a prognosis of events which are disadvantageous to patients or individuals.

Specifically, the present invention provides for prognostic cancer relapse assays based on
measurement of differential methylation of PITX2 CpG dinucleotide sequences. Preferred
gene sequences useful to measure such differential methylation are represented herein by SEQ
ID NOS: 1 to 5. Typically, such assays involve obtaining a tissue sample from a test tissue,
performing an assay to measure the methylation status of at least one of the inventive PITX2-
specific CpG dinucleotide sequences derived from the tissue sample, relative to a control
sample, and making a diagnosis or prognosis or prediction based thereon.

In particular preferred embodiments, inventive oligomers to assess PITX2 specific CpG
dinucleotide methylation status, such as those based on SEQ ID NOS: 1 to 5, or arrays thereof, as
well as a kit based thereon are used for the prognosis of breast cancer relapse and/or the
prediction of survival of a patient diagnosed with breast cancer, preferably under endocrine
treatment since surgical removal of the tumour.

Moreover, an additional aspect of the present invention is a kit comprising, for example: a
bisulfite-containing reagent as well as at least one oligonucleotide whose sequences in each
case correspond, are complementary, or hybridise under stringent or highly stringent
conditions to an 18-base long segment of the sequences SEQ ID NOS: 1 to 5. Said kit may further
comprise instructions for carrying out and evaluating the described method. In a further
preferred embodiment, said kit may further comprise standard reagents for performing a CpG
position-specific methylation analysis, wherein said analysis comprises one or more of the
following techniques: MS-SNuPE, MSP, MethyLight™, HeavyMethyl™, COBRA, and
nucleic acid sequencing. However, a kit along the lines of the present invention can also contain
only part of the aforementioned components.

Typical reagents (e.g., as might be found in a typical COBRA-based kit) for COBRA analysis
may include, but are not limited to: PCR primers for specific gene (or methylation-altered
DNA sequence or CpG island); restriction enzyme and appropriate buffer; gene-hybridisation
oligo; control hybridisation oligo; kinase labelling kit for oligo probe; and radioactive
nucleotides. Additionally, bisulfite conversion reagents may include: DNA denaturation buffer;
sulfonation buffer; DNA recovery reagents or kits (e.g., precipitation, ultrafiltration, affinity
column); desulfonation buffer; and DNA recovery components.

Typical reagents (e.g., as might be found in a typical MethyLight®-based kit) for
MethyLight® analysis may include, but are not limited to: PCR primers for specific gene (or
methylation-altered DNA sequence or CpG island); TaqMan® probes; optimised PCR buffers and
deoxynucleotides; and Taq polymerase.

Typical reagents (e.g., as might be found in a typical Ms-SNuPE-based kit) for Ms-SNuPE
analysis may include, but are not limited to: PCR primers for specific gene (or
methylation-altered DNA sequence or CpG island); optimised PCR buffers and deoxynucleotides; gel
extraction kit; positive control primers; Ms-SNuPE primers for specific gene; reaction buffer
(for the Ms-SNuPE reaction); and radioactive nucleotides. Additionally, bisulfite conversion
reagents may include: DNA denaturation buffer; sulfonation buffer; DNA recovery reagents or
kit (e.g., precipitation, ultrafiltration, affinity column); desulfonation buffer; and DNA
recovery components.

Typical reagents (e.g., as might be found in a typical MSP-based kit) for MSP analysis may
include, but are not limited to: methylated and unmethylated PCR primers for specific gene
(or methylation-altered DNA sequence or CpG island), optimised PCR buffers and
deoxynucleotides, and specific probes.

Specifically, the present invention is related to a method for characterising a cell proliferative 
disorder of the breast tissues and/or predicting the survival of a patient diagnosed with said 
disorder, comprising the steps of: a) detecting the expression of a nucleic acid or a polypep- 
tide expressed from the PITX2 gene in an isolated biological sample representative of said 
cell proliferative disorders of the breast tissues and b) predicting therefrom the survival of
said patient, characteristics of said cell proliferative disorder, and/or prognosis of said patient. 
Preferably, the method according to the present invention further comprises c) determining a 
suitable treatment regimen for the subject. 

Preferred is a method according to the present invention, wherein the patient is characterised
by being subject to adjuvant endocrine therapy comprising one or more treatments which
target the estrogen receptor pathway [...] secretion.

Preferred is also a method according to the present invention, wherein said breast cell prolif- 
erative disorders are taken from the group comprising ductal carcinoma in situ, invasive duc- 
tal carcinoma, invasive lobular carcinoma, lobular carcinoma in situ, comedocarcinoma, in- 
flammatory carcinoma, mucinous carcinoma, scirrhous carcinoma, colloid carcinoma, tubular 
carcinoma, medullary carcinoma, metaplastic carcinoma, papillary carcinoma, papillary
carcinoma in situ, undifferentiated or anaplastic carcinoma and Paget's disease of the
breast. 

According to another aspect of the method according to the present invention, said method is 
characterised in that the detection is carried out by a) contacting said biological sample with 
an antibody immunoreactive with the PITX2 polypeptide to form an immunocomplex; b)
detecting said immunocomplex; and c) predicting therefrom the survival of said patient,
characteristics of said cell proliferative disorder, and/or prognosis of said patient.

According to another aspect of the method according to the present invention, said method is 
characterised in that the detection is carried out by a) contacting said biological sample with 
an antibody immunoreactive with the PITX2 polypeptide to form an immunocomplex; b)
detecting said immunocomplex; c) therefrom predicting the survival of said patient,
characteristics of said cell proliferative disorder, and/or prognosis of said patient; and d) comparing the
quantity of said immunocomplex to the quantity of immunocomplex formed under identical
conditions with the same antibody and a control sample from one or more patients with a 
known prognosis. 

According to yet another aspect of the method according to the present invention, the
detection is carried out by a) contacting said biological sample with an antibody immunoreactive
with the PITX2 polypeptide to form an immunocomplex; b) detecting said immunocomplex;
and c) predicting therefrom the survival of said patient, characteristics of said cell prolifera- 
tive disorder, and/or prognosis of said patient, and wherein an increase in quantity of said 
immunocomplex in the sample from said subject relative to said control sample is indicative 
of a bad prognosis.

Preferred is a method according to the present invention, wherein said immunoassay is a
radioimmunoassay or an ELISA or a Western blot. Preferred is a method according to the
present invention, wherein said detection is afforded by mRNA expression analysis. Most
preferred is a method according to the present invention, comprising detecting the level of
mRNA encoding a PITX2 polypeptide in a biological sample from a patient.

Preferred is a method according to the present invention, wherein an increased concentration of
said mRNA above the concentration determined for an individual known to have a good
prognosis indicates a bad prognosis.

According to yet another aspect of the method according to the present invention said method 
comprises the steps of: a) providing a polynucleotide probe which specifically hybridises or is 
identical to a polynucleotide consisting of SEQ ID NO: 19 or SEQ ID NO: 1, b) incubating 
said sample with said polynucleotide probe under high stringency conditions to form a
specific hybridisation complex between a nucleic acid and said probe; and c) detecting said
hybridisation complex.

Preferred is a method according to the present invention wherein said nucleic acid is mRNA 
or a cDNA derived therefrom. 

Preferred is a method according to the present invention wherein the detecting step further 
comprises the steps of: a) producing a cDNA from mRNA in the sample; b) providing two 
oligonucleotides which specifically hybridise to regions flanking a segment of the cDNA; c) 
performing a polymerase chain reaction on the cDNA of step a) using the oligonucleotides of 
step b) as primers to amplify the cDNA segment; and d) detecting the amplified cDNA seg- 
ment. 
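
By way of illustration only, steps b) to d) above can be mimicked in silico. The following
Python sketch is not part of the described method: the function names and the example
sequences are hypothetical, and exact string matching is assumed as a simplified stand-in
for primer hybridisation. It returns the cDNA segment bounded by the two primer binding
sites, the primer sites themselves included.

```python
from typing import Optional


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    table = str.maketrans("ACGT", "TGCA")
    return seq.upper().translate(table)[::-1]


def in_silico_pcr(cdna: str, forward: str, reverse: str) -> Optional[str]:
    """Return the segment amplified by the two primers, or None if no product.

    Assumption: a primer 'hybridises' only where it matches exactly. The
    forward primer matches the cDNA strand as given; the reverse primer
    matches the opposite strand, so its reverse complement is searched for.
    """
    cdna = cdna.upper()
    start = cdna.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = cdna.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return cdna[start:end + len(reverse_site)]


# Hypothetical toy sequences, not taken from the sequence protocol.
example_cdna = "AAACCCGGGTTTACGTACGTTTTGGGCCCAAA"
amplicon = in_silico_pcr(example_cdna, forward="CCCGGG", reverse="CCCAAA")
```

In this toy case a product is found because both primer sites are present in the correct
orientation; if either site is missing the function returns None, which corresponds to
the absence of a detectable amplified segment in step d).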

Yet another aspect of the present invention relates to a use of a polypeptide expressed from
the PITX2 gene for differentiating or distinguishing between patients diagnosed with breast
cancer, who have a good survival prognosis and patients who have a bad survival prognosis.
Preferably, said polypeptide is expressed from the PITX2 gene and used for prediction of sur- 
vival of a patient diagnosed with a cell proliferative disorder of the breast. 



Preferred is a method according to the present invention wherein said detection comprises 
determining the genetic parameters of the gene PITX2, its promoter and/or regulatory
elements. More preferred is a method according to the present invention wherein said detection
comprises determining the epigenetic parameters of the gene PITX2, its promoter and/or 
regulatory elements. 

Preferred is a method according to the present invention, wherein said detection comprises 
determining the methylation status of one or more CpG positions of a target nucleic acid 
within the gene PITX2, its promoter and/or regulatory elements, in particular through the 
methylation analysis of a genomic DNA sequence according to SEQ ID NO: 1. Preferred is 
further a method according to the present invention, wherein said detection comprises deter- 
mining the methylation status of one or more CpG positions of a target nucleic acid charac- 
terised as being identical to or hybridising under stringent or moderately stringent conditions 
to a sequence out of the group of SEQ ID NOs 13, 18 and 19. Preferred is further a method 
according to the present invention, wherein the methylation analysis is afforded by contacting 
said target nucleic acid with one or more agents that convert cytosine bases that are
unmethylated at the 5'-position thereof to a base that is detectably dissimilar to cytosine in terms of
hybridisation properties. 

Preferred is a method according to the present invention, wherein contacting said target
nucleic acids with one or more agents comprises use of a solution selected from the group
consisting of bisulfite, hydrogen sulfite, disulfite, and combinations thereof.
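
The effect of such a treatment on a sequence can be sketched in silico. The following
Python fragment is an illustration only: the function name, the zero-based position
convention and the example sequence are assumptions made for this sketch. Cytosines
listed as methylated are left unchanged, while every other cytosine is written as
uracil (U), which behaves like thymine in subsequent hybridisation or amplification.

```python
from typing import Iterable


def bisulfite_convert(seq: str, methylated_positions: Iterable[int] = ()) -> str:
    """Simulate bisulfite treatment of one DNA strand.

    Unmethylated cytosines become uracil ('U'); cytosines whose zero-based
    index is listed in methylated_positions are protected and stay 'C'.
    """
    protected = set(methylated_positions)
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in protected:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


# Hypothetical toy sequence; the cytosine at index 1 is assumed to be methylated.
treated_methylated = bisulfite_convert("ACGTCCG", methylated_positions=[1])
treated_unmethylated = bisulfite_convert("ACGTCCG")
```

The two results differ only at the protected CpG position, which is the difference that
the methylation-specific primers and probes described herein are designed to detect.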

Yet another aspect of the present invention relates to the use of a set of oligomer probes com- 
prising at least two oligomers, in particular an oligonucleotide or peptide nucleic acid (PNA)- 
oligomer, said oligomer comprising in each case at least one base sequence having a length of 
at least 9 nucleotides which is complementary to, or hybridises under moderately stringent or 
stringent conditions to a pretreated genomic DNA according to one of the SEQ ID NO: 2 to 
SEQ ID NO: 5 and sequences complementary thereto, for detecting the cytosine methylation
state and/or single nucleotide polymorphisms (SNPs) within one of the sequences according 
to SEQ ID NO: 1, and sequences complementary thereto. 
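
As a rough illustration of the length criterion, the following Python sketch checks
whether an oligomer contains a stretch of at least 9 nucleotides that is exactly
complementary to a target sequence. It is a simplification under stated assumptions:
exact complementarity stands in for hybridisation under stringent conditions, the
function names are hypothetical, and the target shown is an invented sequence rather
than one of SEQ ID NO: 2 to SEQ ID NO: 5.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    table = str.maketrans("ACGT", "TGCA")
    return seq.upper().translate(table)[::-1]


def has_complementary_stretch(oligo: str, target: str, min_len: int = 9) -> bool:
    """True if any window of min_len bases in oligo is complementary to target."""
    oligo, target = oligo.upper(), target.upper()
    for start in range(len(oligo) - min_len + 1):
        window = oligo[start:start + min_len]
        if reverse_complement(window) in target:
            return True
    return False


# Invented example of an amplified, bisulfite-treated strand (C read as T).
toy_target = "TTGATTCGGTTAGTTTAGGA"
toy_oligo = reverse_complement(toy_target[3:14])  # 11-mer designed against the target
qualifies = has_complementary_stretch(toy_oligo, toy_target)
```

An oligomer shorter than the minimum length can never qualify, because no window of
9 nucleotides exists in it.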

Yet another aspect of the present invention relates to a method for manufacturing an
arrangement of different oligomers (array) fixed to a carrier material for analysing diseases associated
with the methylation state of the CpG dinucleotides of SEQ ID NO: 1, and sequences
complementary thereto wherein at least one oligomer, in particular an oligonucleotide or pep- 
tide nucleic acid (PNA)-oligomer, said oligomer comprising in each case at least one base 
sequence having a length of at least 9 nucleotides which is complementary to, or hybridises 
under moderately stringent or stringent conditions to a pretreated genomic DNA according to 
one of the SEQ ID NO: 2 to SEQ ID NO: 5 and sequences complementary thereto, is coupled 
to a solid phase. 

Yet another aspect of the present invention relates to a composition of matter comprising the 
following: a) a nucleic acid comprising a sequence at least 18 bases in length of a segment of
the chemically pretreated genomic DNA according to one of the sequences taken from the
group comprising SEQ ID NO: 1 to SEQ ID NO: 5 and sequences complementary thereto,
and b) a buffer comprising at least one of the following substances: 1 to 5 mM magnesium
chloride, 100-500 µM dNTP, 0.5-5 units/10 µl of Taq polymerase, an oligomer, in particular an
oligonucleotide or peptide nucleic acid (PNA)-oligomer, said oligomer comprising in each 
case at least one base sequence having a length of at least 9 nucleotides which is complemen- 
tary to, or hybridises under moderately stringent or stringent conditions to a pretreated geno- 
mic DNA according to one of the SEQ ID NO: 2 to SEQ ID NO: 5 and sequences comple- 
mentary thereto. 

Preferably, the gene PITX2, its promoter and/or regulatory elements is used for predicting the 
survival of patients diagnosed with a cell proliferative disease. Preferred is the use of the 
mRNA of the gene PITX2 for predicting the survival of patients diagnosed with a cell prolif- 
erative disease. 

Yet another aspect of the present invention relates to a method for predicting the survival of 
patients diagnosed with a cell proliferative disease according to the present invention,
comprising: a) isolating or enriching genomic DNA from said biological sample; b) treating the
genomic DNA, or a fragment thereof, with one or more reagents to convert 5-position
unmethylated cytosine bases to uracil or to another base that is detectably dissimilar to cytosine
in terms of hybridisation properties; c) contacting the treated genomic DNA, or the treated
fragment thereof, with an amplification enzyme and at least two primers comprising, in each
case, a contiguous sequence at least 18 nucleotides in length that is complementary to, or
hybridises under moderately stringent or stringent conditions to, a sequence selected from the
group consisting of SEQ ID NOS: 2 to 5, and complements thereof, wherein the treated DNA
or a fragment thereof is either amplified to produce one or more amplificates, or is not
amplified; and d) determining, based on the presence or absence of, or on the quantity or on a
property of said amplificate, the methylation state of at least one CpG dinucleotide sequence of
SEQ ID NO: 1, or an average, or a value reflecting an average methylation state of a plurality
of CpG dinucleotide sequences of SEQ ID NO: 1.
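
Step d) leaves open how the methylation state or its average is computed. One possible
reading is sketched below in Python, under the assumption (not stated in the text) that
the readout for each CpG position consists of two signals: one for cytosine retained
after treatment (methylated) and one for thymine (converted, unmethylated). The function
names and the signal values are hypothetical.

```python
from typing import List, Tuple


def methylation_fraction(methylated_signal: float, unmethylated_signal: float) -> float:
    """Fraction of methylated signal at one CpG position (0.0 to 1.0)."""
    total = methylated_signal + unmethylated_signal
    if total <= 0:
        raise ValueError("no signal at this CpG position")
    return methylated_signal / total


def average_methylation(signals: List[Tuple[float, float]]) -> float:
    """Unweighted mean of the per-position methylation fractions."""
    if not signals:
        raise ValueError("at least one CpG position is required")
    fractions = [methylation_fraction(m, u) for m, u in signals]
    return sum(fractions) / len(fractions)


# Hypothetical signals for two CpG positions: (methylated, unmethylated).
toy_signals = [(30.0, 10.0), (10.0, 30.0)]
toy_average = average_methylation(toy_signals)
```

An unweighted mean is used here for simplicity; a weighted average or another summary
value would equally fall under "a value reflecting an average methylation state".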

Yet another aspect of the present invention relates to a method for detecting the survival of 
patients diagnosed with a cell proliferative disease of the breast according to the present
invention, comprising the following steps of a) obtaining, from a subject, a biological sample
having subject genomic DNA; b) treating the genomic DNA, or a fragment thereof, with one
or more reagents to convert 5-position unmethylated cytosine bases to uracil or to another
base that is detectably dissimilar to cytosine in terms of hybridisation properties; c)
amplifying one or more fragments of the treated DNA such that only DNA originating from breast or
breast cell proliferative disorder cells is amplified; and d) detecting the amplificates or
characteristics thereof and thereby deducing the survival of patients diagnosed with a cell
proliferative disease of the breast.

Yet another aspect of the present invention relates to a use of an oligomer, an oligonucleotide 
or peptide nucleic acid (PNA)-oligomer, said oligomer comprising in each case at least one 
base sequence having a length of at least 9 nucleotides which is complementary to, or hybrid- 
ises under moderately stringent or stringent conditions to an artificially modified, chemically 
pretreated DNA according to one of the SEQ ID NO: 2 to SEQ ID NO: 5 and sequences com- 
plementary thereto, for differentiating or distinguishing between patients diagnosed with 
breast cancer, who have a good survival prognosis and patients who have a bad survival
prognosis.

Yet another aspect of the present invention relates to a use of a nucleic acid comprising a
sequence of at least 18 bases in length of a segment of the artificially modified, chemically
pretreated DNA according to one of the sequences taken from the group comprising SEQ ID
NO: 2 to SEQ ID NO: 5 and sequences complementary thereto, for differentiating or
distinguishing between patients diagnosed with breast cancer, who have a good survival prognosis
and patients who have a bad survival prognosis.

Yet another aspect of the present invention relates to a use of an oligomer, an oligonucleotide
or peptide nucleic acid (PNA)-oligomer, said oligomer comprising in each case at least one base
sequence having a length of at least 9 nucleotides which is complementary to, or hybridises
under moderately stringent or stringent conditions to an artificially modified, chemically
pretreated DNA according to one of the SEQ ID NO: 2 to SEQ ID NO: 5 and sequences
complementary thereto, for prediction of survival of a patient diagnosed with a cell
proliferative disorder of the breast.

Yet another aspect of the present invention relates to a use of a nucleic acid comprising a
sequence of at least 18 bases in length of a segment of the artificially modified, chemically
pretreated DNA according to one of the sequences taken from the group comprising SEQ ID
NO: 2 to SEQ ID NO: 5 and sequences complementary thereto, for prediction of survival of a
patient diagnosed with a cell proliferative disorder of the breast.

Yet another aspect of the present invention relates to a use of a nucleic acid represented by
its SEQ ID NO out of the group of nucleic acids according to SEQ ID NOS: 6, 7, 8, 9, 10, 14,
15, 22, 23, 24, 25, 26, 27, 28, 29, 30 and 31, for differentiating or distinguishing between
patients diagnosed with breast cancer, who have a good survival prognosis and patients who
have a bad survival prognosis.

Yet another aspect of the present invention relates to a use of a nucleic acid represented by
its SEQ ID NO out of the group of nucleic acids according to SEQ ID NOS: 6, 7, 8, 9, 10, 14,
15, 22, 23, 24, 25, 26, 27, 28, 29, 30 and 31, for prediction of survival of a patient
diagnosed with a cell proliferative disorder of the breast.

In the context of this invention the terms "obtaining a biological sample" or "obtaining a
sample from a subject" are meant to comprise several different sources of such a sample,
but always exclude the active retrieval of a sample from an individual patient, such as the
performance of a biopsy. Included are the following examples: obtaining a sample, which was
prior to the obtaining step taken from a patient in a biopsy or surgery, from a sample provider;
obtaining a sample from a clinician, a surgeon or other medical personnel; obtaining a sample
from a courier, who is bringing the sample from the clinician or practitioner or patient
himself, for example to the analytic service station; as well as obtaining a sample such as a body
fluid sample per post or from the hands of the patient himself. This is not meant to be a
limiting list, but shall illustrate that it is not a feature of the invention that it needs to be
carried out on the patient himself.

The term "biological material" relates to any material that is derived from a source, in
particular an animal and/or human source, that contains or is suspected to contain [...].
One example of a biological material according to the present invention will be a biological
sample.

In the context of the present invention, the term "CpG island" refers to a contiguous region of 
genomic DNA that satisfies the criteria of (1) having a frequency of CpG dinucleotides corre- 
sponding to an "Observed/Expected Ratio" >0.6, and (2) having a "GC Content" >0.5. CpG 
islands are typically, but not always, between about 0.2 to about 1 kb in length.
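
The two criteria can be evaluated for a given sequence as sketched below in Python. The
definition above does not spell out how the "Observed/Expected Ratio" is calculated; this
sketch assumes the commonly used formula (number of CpG x sequence length) / (number of C
x number of G). The function names and the test sequences are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_observed_expected(seq: str) -> float:
    """Observed/expected CpG ratio, assuming (CpG count * N) / (C count * G count)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def meets_cpg_island_criteria(seq: str) -> bool:
    """True if the sequence satisfies criteria (1) and (2) as defined above."""
    return cpg_observed_expected(seq) > 0.6 and gc_content(seq) > 0.5


# Toy sequences for illustration; real CpG islands are far longer (about 0.2 to 1 kb).
cpg_rich = meets_cpg_island_criteria("CGCGCGCGCG")
gc_rich_but_cpg_poor = meets_cpg_island_criteria("CCCCCGGGGG")
```

The second toy sequence shows why both criteria are needed: it consists entirely of G and
C, yet contains only one CpG dinucleotide, so its observed/expected ratio stays below 0.6.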

In the context of the present invention the term "regulatory region" of a gene is taken to mean 
nucleotide sequences which affect the expression of a gene. Said regulatory regions may be 
located within, proximal or distal to said gene. Said regulatory regions include but are not 
limited to constitutive promoters, tissue-specific promoters, developmental-specific promot- 
ers, inducible promoters and the like. Promoter regulatory elements may also include certain 
enhancer sequence elements that control transcriptional or translational efficiency of the gene.

In the context of the present invention, the term "methylation" refers to the presence or
absence of 5-methylcytosine ("5-mCyt") at one or a plurality of CpG dinucleotides within a
DNA sequence.

In the context of the present invention the term "methylation state" is taken to mean the
degree of methylation present in a nucleic acid of interest. This may be expressed in absolute or
relative terms, i.e. as a percentage or other numerical value, or by comparison to another tissue
and therein described as hypermethylated, hypomethylated or as having significantly similar
or identical methylation status.

In the context of the present invention, the term "hemi-methylation" or "hemimethylation" 
refers to the methylation state of a palindromic CpG methylation site, where only a single 
cytosine in one of the two CpG dinucleotide sequences of the double stranded CpG
methylation site is methylated (e.g., 5'-NNC^MGNN-3' (top strand): 3'-NNGCNN-5' (bottom
strand), where C^M denotes the methylated cytosine).

In the context of the present invention, the term "hypermethylation" refers to the average
methylation state corresponding to an increased presence of 5-mCyt at one or a plurality of
CpG dinucleotides within a DNA sequence of a test DNA sample, relative to the amount of
5-mCyt found at corresponding CpG dinucleotides within a normal control DNA sample.

In the context of the present invention, the term "hypomethylation" refers to the average 
methylation state corresponding to a decreased presence of 5-mCyt at one or a plurality of 
CpG dinucleotides within a DNA sequence of a test DNA sample, relative to the amount of 5- 
mCyt found at corresponding CpG dinucleotides within a normal control DNA sample. 
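
The relative terms defined above can be illustrated with a short Python sketch. The
definitions do not fix how large a difference must be before a sample is called
hypermethylated or hypomethylated; the tolerance used below is therefore a purely
hypothetical parameter, as are the function name and the example values.

```python
def classify_methylation(test_level: float, control_level: float,
                         tolerance: float = 0.05) -> str:
    """Compare average methylation of a test sample with a normal control.

    Levels are fractions between 0.0 and 1.0. Differences within the
    (assumed) tolerance are reported as similar methylation status.
    """
    difference = test_level - control_level
    if difference > tolerance:
        return "hypermethylated"
    if difference < -tolerance:
        return "hypomethylated"
    return "similar"


# Hypothetical average methylation levels (test sample, normal control).
status_high = classify_methylation(0.80, 0.30)
status_low = classify_methylation(0.10, 0.40)
status_same = classify_methylation(0.32, 0.30)
```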

In the context of the present invention, the term "microarray" refers broadly to both "DNA 
microarrays," and 'DNA chip(s),' as recognised in the art, encompasses all art-recognised 
solid supports, and encompasses all methods for affixing nucleic acid molecules thereto or 
synthesis of nucleic acids thereon. 

"Genetic parameters" are mutations and polymorphisms of genes and sequences further
required for their regulation. Mutations are, in particular, insertions, deletions, point
mutations, inversions and polymorphisms and, particularly preferred, SNPs (single
nucleotide polymorphisms).

"Epigenetic modifications" or "epigenetic parameters" are modifications of DNA bases of 
genomic DNA and sequences further required for their regulation, in particular, cytosine 
methylations thereof. Further epigenetic parameters include, for example, the acetylation of 
histones which, however, cannot be directly analysed using the described method but which, 
in turn, correlate with the DNA methylation.

In the context of the present invention, the term "bisulfite reagent" refers to a reagent
comprising bisulfite, disulfite, hydrogen sulfite or combinations thereof, useful as disclosed herein
to distinguish between methylated and unmethylated CpG dinucleotide sequences. 

In the context of the present invention, the term "Methylation assay" refers to any assay for 
determining the methylation state of one or more CpG dinucleotide sequences within a
sequence of DNA.

In the context of the present invention, the term "MS.AP-PCR" (Methylation-Sensitive
Arbitrarily-Primed Polymerase Chain Reaction) refers to the art-recognised technology that allows
for a global scan of the genome using CG-rich primers to focus on the regions most likely to
contain CpG dinucleotides, and described by Gonzalgo et al., Cancer Research 57:594-599,
1997.

In the context of the present invention, the term "MethyLight" refers to the art-recognised 
fluorescence-based real-time PCR technique described by Eads et al., Cancer Res. 59:2302-2306,
1999.

In the context of the present invention, the term "HeavyMethyl™" assay, in the embodiment
thereof implemented herein, refers to a HeavyMethyl™ MethyLight assay, which is a
variation of the MethyLight assay, wherein the MethyLight assay is combined with
methylation-specific blocking probes covering CpG positions between the amplification primers.

The term "Ms-SNuPE" (Methylation-sensitive Single Nucleotide Primer Extension) refers to
the art-recognised assay described by Gonzalgo & Jones, Nucleic Acids Res. 25:2529-2531, 
1997. 

In the context of the present invention the term "MSP" (Methylation-specific PCR) refers to
the art-recognised methylation assay described by Herman et al., Proc. Natl. Acad. Sci. USA
93:9821-9826, 1996, and by US Patent No. 5,786,146.

In the context of the present invention the term "COBRA" (Combined Bisulfite Restriction 
Analysis) refers to the art-recognised methylation assay described by Xiong & Laird, Nucleic 
Acids Res. 25:2532-2534, 1997. 

In the context of the present invention the term "hybridisation" is to be understood as a bond 
of an oligonucleotide to a complementary sequence along the lines of the Watson-Crick base 
pairings in the sample DNA, forming a duplex structure. 

"Stringent hybridisation conditions," as defined herein, involve hybridising at 68°C in 5x 
SSC/5x Denhardt's solution/1.0% SDS, and washing in 0.2x SSC/0.1% SDS at room
temperature, or involve the art-recognised equivalent thereof (e.g., conditions in which a
hybridisation is carried out at 60°C in 2.5x SSC buffer, followed by several washing steps at 37°C in
a low buffer concentration, and remains stable). Moderately stringent conditions, as defined
herein, involve washing in 3x SSC at 42°C, or the art-recognised equivalent thereof.
The parameters of salt concentration and temperature can be varied to achieve the optimal 
level of identity between the probe and the target nucleic acid. Guidance regarding such con- 
ditions is available in the art, for example, by Sambrook et al., 1989, Molecular Cloning, A 
Laboratory Manual, Cold Spring Harbor Press, N.Y.; and Ausubel et al. (eds.), 1995, Current 
Protocols in Molecular Biology, (John Wiley & Sons, N.Y.) at Unit 2.10. 

"Background DNA" as used herein refers to any nucleic acids which originate from sources
other than colon cells. 

In the context of this application "survival" is meant to describe the time from diagnosis or
start of treatment to an endpoint, which may be either the time of death (considering any
reason for death or only death from breast cancer), or the time of recurrence of breast cancer (for
example in form of metastases), which may be local or distant, or the time of occurrence of
any breast cancer associated disease. Therefore "predicting the survival" is meant to comprise
predicting the disease free survival, as well as the overall survival or any other consideration
of time between diagnosis and endpoint of treatment. However, as is obvious in the state of
the art, a precise prediction of life time is generally impossible, whether it is based on a
biomarker analysis or on any other prognostic tool. It is therefore understood throughout the
invention that said term "prediction of survival" or "predicting the survival" is used to
describe the risk of a patient to suffer from a recurrence of metastasis or other disease caused
by the original breast cell proliferative disease the patient was diagnosed with (also termed
"risk of relapse"). Said risk can be predicted with a certain probability or likelihood. It is
also clear that predicting the survival is meant to comprise the determination of the
likelihood or probability whether a subject or patient will survive for a longer or shorter
period of time.

Throughout this invention it is preferred that said survival is characterised as the disease free 
or the overall survival. It is especially preferred that survival is understood as disease free 
survival. Disease free survival is understood as absence of recurrence of cancer (local or
distant).

The terms "endocrine therapy" or "endocrine treatment" are meant to comprise any therapy,
treatment or treatments targeting the estrogen receptor pathway or estrogen synthesis pathway
or estrogen [...] secretion. Said treatments include, but are not limited to estrogen receptor
modulators, estrogen receptor down-regulators, aromatase inhibitors, ovarian ablation, LHRH analogues and
other centrally acting drugs influencing estrogen production. 

The term "monotherapy" is used to explain that no other treatment is given in addition or to 
support said monotherapy. 

In the context of the present invention the term "chemotherapy" is taken to mean the use of 
drugs or chemical substances to treat cancer. This definition excludes radiation therapy 
(treatment with high energy rays or particles), hormone therapy (treatment with hormones or
hormone analogues (synthetic substitutes)) and surgical treatment.

In the context of the present invention the term "adjuvant treatment" is taken to mean a
therapy of a cancer patient immediately following an initial non-chemotherapeutical therapy, e.g.
surgery. In general, the purpose of an adjuvant therapy is to provide a significantly smaller
risk of recurrences compared with no adjuvant therapy.

In the context of the present invention the term "determining a suitable treatment regimen for 
the subject" is taken to mean a treatment regimen (i.e. a single therapy or a combination of 
different therapies that are used for the prevention and/or treatment of the cancer in the pa- 
tient) for the cancer patient that is started, modified and/or ended based or essentially based or 
at least partially based on the results of the analysis according to the present invention. One 
example is starting an adjuvant endocrine therapy after surgery, another would be to modify 
the dosage of a particular chemotherapy. The determination can, in addition to the results of 
the analysis according to the present invention, be based on personal characteristics of the 
subject to be treated. In most cases, the actual determination of the suitable treatment regimen 
for the subject will be performed by the attending physician or doctor. 

While the present invention has been described with specificity in accordance with certain of 
its preferred embodiments, the following examples and figures serve only to illustrate the
invention and are not intended to limit the invention within the principles and scope of the
broadest interpretations and equivalent configurations thereof.

In the accompanying sequence protocol and the figures:

SEQ ID NO: 1 shows the nucleic acid sequence of the human gene PITX2.

SEQ ID NOS: 2 to 5 show chemically pretreated nucleic acid sequences of the gene PITX2,
according to table 1.

SEQ ID NOS: 6 to 9 show the nucleic acid sequences of those primers and probes useful to 
predict the survival of breast cancer patients according to the invention as described in exam- 
ple 4. 

SEQ ID NOS: 14 to 17 show the nucleic acid sequences of those primers and probes useful to 
predict the survival of breast cancer patients according to the invention as described in
example 5.

SEQ ID NOS: 10 to 12 show the nucleic acid sequences of primers and probes according to a 
control gene used in examples 4 and 5.

SEQ ID NO: 13 shows a subsequence of SEQ ID NO: 1, which represents the nucleic acid
sequence of the human gene PITX2.

SEQ ID NO: 18 shows an amino acid sequence of the polypeptide encoded by the gene 
PITX2. The amino acid sequence of the polypeptide encoded by the gene PITX2 is also illus- 
trated in figure 10. 

FIGURES 

Figure 1 presents a scheme to illustrate a preferred application of the method according to the 
invention. Along the Y axis tumour(s) mass (or size) increases, wherein the line '3' indicates 
the limit of detectability of said tumour mass. The X axis represents time (such as in life time 
of a patient). Accordingly said figure illustrates a simplified model of a Stage 1-3 breast
tumour wherein primary treatment was surgery (at point 1), followed by adjuvant therapy with
Tamoxifen, as an example for an endocrine treatment. In a first scenario a patient without
relapse during endocrine treatment (4) is shown as remaining below the limit of detectability
for the duration of the observation. A patient with relapse of the cancer (5) has a period of
disease free survival (2) followed by relapse when the carcinoma mass reaches the level of
detectability. 

Figure 2 shows the result of the assay (QM assay) as described in example 4: A Kaplan-Meier
estimated metastasis-free survival curve for 3 CpG sites of the PITX2 gene by means of
Real-Time methylation specific probe analysis (QM assay). The lower curve shows the proportion
of metastasis free patients in the population with above median methylation levels, the upper
curve shows the proportion of metastasis free patients in the population with below median
methylation levels. The X axis shows the metastasis free survival times of the patients in
months, and the Y axis shows the proportion of metastasis free survival patients.

Figure 3 shows the result of the chip hybridisation experiment as described in example 2:
Kaplan-Meier estimated metastasis-free survival curves for 2 CpG positions of the PITX2
gene by means of methylation specific detection oligo hybridisation analysis. The lower curve
shows the proportion of metastasis free patients in the population with above median meth- 
ylation levels, the upper curve shows the proportion of metastasis free patients in the popula- 
tion with below median methylation levels. The X axis shows the metastasis free survival 
times of the patients in months, and the Y axis shows the proportion of metastasis free sur- 
vival patients. 

Figure 4 shows the Kaplan-Meier estimated metastasis-free survival curves for 2 CpG posi- 
tions of the PITX2 gene by means of methylation specific detection oligo hybridisation analy- 
sis. The lower line shows the proportion of metastasis free patients in the population of 55 
patients with above median methylation levels, the upper curve shows the proportion of me- 
tastasis free patients in the population of 54 patients with below median methylation levels. 
The X axis shows the metastasis free survival times of the patients in years, and the Y axis
shows the proportion of metastasis free survival patients in %. This resulted from a first data
set that was achieved in a first study.

Figure 5 shows the Kaplan-Meier estimated metastasis-free survival curves for 6 different 
CpG positions located within the preferred region of the PITX2 gene (SEQ ID 13) by means 
of methylation specific detection oligo hybridisation analysis. The lower line shows the
proportion of metastasis free patients in the population of 118 patients with above median
methylation levels, the upper curve shows the proportion of metastasis free patients in the
population of 118 patients with below median methylation levels. The X axis shows the metastasis
free survival times of the patients in years, and the Y axis shows the proportion of metastasis
free survival patients in %. This resulted from a second data set that was achieved in a second
study.

Figure 6 shows the Kaplan-Meier estimated metastasis-free survival curves for 6 different 
CpG positions located within the preferred region of the PITX2 gene (SEQ ID 13) by means 
of methylation specific detection oligo hybridisation analysis. This time only a subpopulation 
of 148 patients, characterised by a tumour at grade G1 or G2, was analysed: The lower curve
shows the proportion of metastasis free patients in the population of 74 patients with above 
median methylation levels, the upper curve shows the proportion of metastasis free patients in 
the population of 74 patients with below median methylation levels. The X axis shows the 
metastasis free survival times of the patients in years, and the Y axis shows the proportion of 
metastasis free survival patients in %. This resulted from a second data set that was achieved 
in the second example. 

Figure 7 shows the Kaplan-Meier estimated metastasis-free survival curves for 4 different 
CpG positions located within the preferred region of the PITX2 gene (SEQ ID 13) by means 
of methylation specific detection oligo hybridisation analysis. This time a subpopulation of 
224 patients, characterised by a tumour of stage 1 or 2 (T1 or T2), was analysed: The lower
curve shows the proportion of metastasis free patients in the population of 112 patients with
above median methylation levels, the upper curve shows the proportion of metastasis free
patients in the population of 112 patients with below median methylation levels. The X axis
shows the metastasis free survival times of the patients in years, and the Y axis shows the 
proportion of metastasis free survival patients in %. This resulted from the second data set 
that was achieved in the second example. 

Figure 8 shows the disease-free survival curves for a combination of two oligonucleotides 
each from the genes TBC1D3 and CDK6, and one oligonucleotide from the gene PITX2 cov- 
ering two CpG sites. The black curve shows the proportion of disease free patients in the 
population with above median methylation scores, the grey curve shows the proportion
of disease free patients in the population with below median methylation scores.

Figure 9 shows the plot according to Figure 8 and the classification of the sample set by 
means of the St. Gallen method. The unbroken lines represent the methylation analysis 
wherein the black curve shows the proportion of disease free patients in the population with 
above median methylation scores, the grey curve shows the proportion of disease free patients 
in the population with below median methylation scores. The broken lines represent the St.
Gallen classification of the sample set wherein the black curve shows the disease free survival 
time of the high risk group and the grey curve shows the disease free survival of the low risk 
group. 

Figure 10 illustrates the amino acid sequence of the polypeptide encoded by the gene PITX2.

EXAMPLES 

EXAMPLE 1 : Study 1 

The first study was based on a population of 109 patients, comprising patients of both nodal
statuses N0 and N+. All patients were ER+ (estrogen receptor positive). All patients received
Tamoxifen monotherapy immediately after surgery or diagnosis. The samples were analysed
using Epigenomics' chip technology with two chip panels representing altogether 117 candidate
genes. For further details see patent applications WO 04/035803 and EP 03 090 432.0,
which are hereby incorporated by reference. In this study one significant marker gene was
found. The methylation status of PITX2, coding for a transcription factor, showed a statistically
significant correlation with disease-free survival under adjuvant Tamoxifen treatment. A Cox
regression model that includes the nodal status of the patient at the time of diagnosis was
applied.

The result from this study - with respect to PITX2 - is illustrated in Figure 4. The X axis 
shows the metastasis free survival times of the patients in years, and the Y axis shows the 
proportion of metastasis free survival patients in %. Amongst the 54 patients (upper line) with
below median methylation levels a higher percentage has a significantly longer metastasis
free survival time than amongst the 55 patients with above median methylation levels (lower
line). To illustrate the result: 10 years after surgery under tamoxifen monotherapy,
more than 75% of the patients with low methylation in PITX2 are still metastasis free,
whereas less than 60% of the patients with high methylation in PITX2 are.

As the survival of a breast cancer patient is known to also be correlated to the patient's nodal
status, the differentiating power of the marker in this mixed population is expected to be less
than in a homogeneous population.

Another study was performed to analyse whether the same marker can be identified independently
in a completely different set of patient samples, and also to characterise the differential
power towards predicting survival for a sub-group of patients, all being N0.



EXAMPLE 2 : Study 2 

The second study was based on samples from 236 patients from 5 different centres, wherein
all patients were N0 (nodal status negative) and older than 35 years. In all cases the surgery
was performed before 1998. All patients were ER+ (estrogen receptor positive), and the tumours
were graded as T1-3, G1-3. In this study all patients received Tamoxifen directly
after surgery, and the outcome was assessed as the length of disease-free survival. In order to
be as representative as possible for the final target group, the patients and their tumour samples
had to fulfil the following criteria:

The range and median follow-up of patients were the following: 

Median: 64.5 months 

Range: 3 months to 142 months 

(calculated based on patients who were disease-free at end of observation time). 
Analysis of the methylation patterns of patient samples treated with Tamoxifen as an adjuvant
therapy immediately following surgery (see Figure 1) is shown in the plots according to
Figures 5 to 7. For the amplificate, the mean methylation over 4 oligo-pairs for that amplificate
was calculated and the population split into groups according to their mean methylation
values, wherein one group was composed of individuals with a methylation score higher than the
median and a second group composed of individuals with a methylation score lower than the
median.
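
The grouping described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `split_by_median`, the patient identifiers and the methylation values are hypothetical, and the handling of a score exactly equal to the median (left out of both groups here) is an assumption, since the text does not specify it.

```python
from statistics import mean, median


def split_by_median(scores):
    """Split patients into high and low methylation groups.

    scores maps a patient identifier to the list of methylation values
    measured for the oligo pairs of one amplificate (hypothetical layout).
    """
    # Mean methylation per patient across all oligo pairs.
    mean_scores = {pid: mean(values) for pid, values in scores.items()}
    # Population median of the per-patient means.
    cutoff = median(mean_scores.values())
    high = sorted(pid for pid, s in mean_scores.items() if s > cutoff)
    low = sorted(pid for pid, s in mean_scores.items() if s < cutoff)
    return cutoff, high, low


# Illustrative values only (percent methylation for 4 oligo pairs).
example = {
    "P1": [10, 20, 15, 15],
    "P2": [40, 40, 40, 40],
    "P3": [60, 70, 50, 60],
    "P4": [5, 5, 5, 5],
}
cutoff, high_group, low_group = split_by_median(example)
print(cutoff, high_group, low_group)
```

The two resulting groups are the ones whose survival curves are compared in the Kaplan-Meier plots.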

The primer oligonucleotides used to generate the amplificate that was analysed in the array
experiment were the following:

Array Primer PITX2_Q21: GTAGGGGAGGGAAGTAGATGT (SEQ ID 22) 
Array Primer PITX2_R23: TCCTCAACTCTACAAACCTAAAA (SEQ ID 23) 
The according genomic region of said amplificate is given in SEQ ID 13. 

The sequences of the oligonucleotides used in this array experiment were the following: 

SEQ ID NO 24 : AGTCGGGAGAGCGAAA 

SEQ ID NO 25 : AGTTGGGAGAGTGAAA 

SEQ ID NO 26 : AAGAGTCGGGAGTCGGA 

SEQ ID NO 27: AAGAGTTGGGAGTTGGA

SEQ ID NO 28: GGTCGAAGAGTCGGGA 

SEQ ID NO 29: GGTTGAAGAGTTGGGA 

SEQ ID NO 30: ATGTTAGCGGGTCGAA

SEQ ID NO 31: TAGTGGGTTGAAGAGT

When the data derived from analysing 6 different CpG sites, located within the preferred
amplified region of the PITX2 gene by means of methylation specific detection oligo hybridisation
analysis, were plotted as Kaplan-Meier estimated metastasis-free survival curves, it could be
seen that the differential power of the marker PITX2 increased with selecting for N0 patients.
This is shown in figures 5 to 7. The X axis shows the metastasis free survival times of the
patients in years, and the Y axis shows the proportion of metastasis free survival patients in
%. The lower curve shows the proportion of metastasis free patients in the population with
above median methylation levels, and the upper curve shows the proportion of metastasis free 
patients in the population with below median methylation levels. 
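
For readers unfamiliar with the Kaplan-Meier estimate behind these curves, the following Python sketch shows how the proportion of metastasis free patients can be computed from follow-up data. It is a generic illustration of the product-limit estimator, not the software used in the studies; the function name `kaplan_meier` and the follow-up values are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of metastasis free survival.

    times  : follow-up time of each patient
    events : True if metastasis was observed at that time,
             False if the patient was censored (still metastasis free)
    Returns a list of (time, surviving proportion) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # metastases observed at time t
        leaving = 0    # all patients leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                observed += 1
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Illustrative follow-up data only (years, event flag).
for t, s in kaplan_meier([2, 3, 3, 5, 8], [True, True, False, False, True]):
    print(t, round(s, 3))
```

Applying the estimator separately to the above-median and below-median methylation groups yields the lower and upper curves of the plots.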

For example, as illustrated in figure 5, 10 years after surgery only about 65% of the
118 patients with the higher methylation status are metastasis free, whereas about 90%
of the 118 patients with lower methylation status are metastasis free.

As illustrated in figure 6, when looking at the analogous Kaplan-Meier analysis for a
subpopulation of 148 patients, characterised by a tumour at grade G1 or G2, this differential
power increases again: 10 years after surgery only about 60% of the 74 patients with the
higher methylation status are metastasis free, whereas about 95% of the 74 patients with
lower methylation status are metastasis free.

Figure 7 illustrates how the survival is also correlated to the tumour stage at surgery by
showing the analogous Kaplan-Meier analysis for a subpopulation of 150 patients, characterised
by a tumour stage of T1 or T2: 10 years after surgery about 68% of the 112 patients
with the higher methylation status are metastasis free, whereas about 95% of the 112
patients with lower methylation status are metastasis free.

EXAMPLE 3: 

The accuracy of the differentiation between the different groups was further increased by
combining multiple oligonucleotides from different genes. As described in the text, it was
recognised that adding further informative markers to the analysis could potentially increase
the prognostic power of a survival test. Therefore it was calculated how a combination of two
methylation specific oligonucleotides each from the genes TBC1D3 and CDK6, and one
oligonucleotide from the gene PITX2, would differentiate the groups of good or bad prognosis.
The result is shown in figure 8 as the corresponding Kaplan-Meier curve.

Figure 9 shows, in addition to the curves of Figure 8, the classification of the patients from
the sample set by means of the St. Gallen method (the current method of choice for estimating
disease free survival), thereby showing the improved effectiveness of methylation analysis over
current methods, in particular beyond 80 months.

EXAMPLE 4: Real time Quantitative methylation analysis 

Genomic DNA was analysed using the Real Time PCR technique after bisulfite conversion.
In this analysis four oligonucleotides were used in each reaction. Two non methylation specific
PCR primers were used to amplify a segment of the treated genomic DNA containing a
methylation variable oligonucleotide probe binding site. Two oligonucleotide probes competitively
hybridise to the binding site, one specific for the methylated version of the binding
site, the other specific to the unmethylated version of the binding site. Accordingly, one of the
probes comprises a CpG at the methylation variable position (i.e. anneals to methylated bisulphite
treated sites) and the other comprises a TpG at said position (i.e. anneals to unmethylated
bisulphite treated sites). Each species of probe is labelled with a 5' fluorescent reporter
dye and a 3' quencher dye, wherein the CpG and TpG oligonucleotides are labelled with different
dyes.
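
The relationship between the two competing probes can be illustrated with a short Python sketch: the probe for the unmethylated version carries TpG wherever the probe for the methylated version carries CpG. The helper name `tg_variant` is hypothetical; the two sequences are the probes of SEQ ID NO: 8 and SEQ ID NO: 9 listed further below in this example. Note that the TpG probe used in the assay is longer than the CpG probe, so the sketch only checks the shared 5' portion.

```python
def tg_variant(cg_probe: str) -> str:
    """Return the probe sequence that matches the unmethylated,
    bisulfite converted template: every CpG is read as TpG."""
    return cg_probe.upper().replace("CG", "TG")


cg_probe = "AGTCGGAGTCGGGAGAGCGA"          # probe of SEQ ID NO: 8
tg_probe = "AGTTGGAGTTGGGAGAGTGAAAGGAGA"   # probe of SEQ ID NO: 9

converted = tg_variant(cg_probe)
print(converted)
# The TpG probe begins with the converted CpG probe sequence.
print(tg_probe.startswith(converted))
```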

The reactions are calibrated by reference to DNA standards of known methylation levels in
order to quantify the levels of methylation within the sample. The DNA standards were composed
of bisulfite treated phi29 amplified genomic DNA (i.e. unmethylated), and/or phi29
amplified genomic DNA treated with SssI methylase enzyme (thereby methylating each CpG
position in the sample), which is then treated with bisulfite solution. Seven different reference
standards were used with 0% (i.e. phi29 amplified genomic DNA only), 5%, 10%, 25%,
50%, 75% and 100% (i.e. phi29 SssI treated genomic DNA only).
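
One possible way to use the seven standards is sketched below in Python. The text does not state how the calibration was computed, so the piecewise-linear interpolation, the function name `calibrate` and the standard readings are all assumptions made purely for illustration.

```python
from bisect import bisect_left

# Known methylation levels of the seven reference standards (percent).
STANDARD_LEVELS = [0, 5, 10, 25, 50, 75, 100]


def calibrate(measured, standard_readings, levels=STANDARD_LEVELS):
    """Map a measured assay value onto the standards by linear
    interpolation between the two neighbouring standards (assumed scheme)."""
    if len(standard_readings) != len(levels):
        raise ValueError("one reading per standard is required")
    pairs = zip(standard_readings, standard_readings[1:])
    if any(b <= a for a, b in pairs):
        raise ValueError("standard readings must be strictly increasing")
    # Clamp values outside the calibrated range.
    if measured <= standard_readings[0]:
        return float(levels[0])
    if measured >= standard_readings[-1]:
        return float(levels[-1])
    i = bisect_left(standard_readings, measured)
    x0, x1 = standard_readings[i - 1], standard_readings[i]
    y0, y1 = levels[i - 1], levels[i]
    return y0 + (y1 - y0) * (measured - x0) / (x1 - x0)


# Hypothetical uncalibrated readings obtained for the seven standards.
readings = [2, 8, 14, 30, 55, 78, 98]
print(calibrate(22, readings))
```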

The amount of sample DNA amplified is quantified by reference to the gene β-actin
(ACTB) to normalise for input DNA. For standardisation, the primers and the probe for
analysis of the ACTB gene lack CpG dinucleotides so that amplification is possible regardless
of methylation levels. As there are no methylation variable positions, only one probe
oligonucleotide is required.
The following oligonucleotides were used in the reaction to amplify the control amplificate:

Control Primerl: TGGTGATGGAGGAGGTTTAGTAAGT (SEQ ID NO: 10) 

Control Primer2: AACCAATAAAACCTACTCCTCCCTTAA (SEQ ID NO: 11)

Control Probe: 6FAM-ACCACCACCCAACACACAATAACAAACACA-TAMRA or Dabcyl (SEQ ID NO: 12)

The nucleic acid sequence of the gene PITX2 is given in SEQ ID 1. After treatment with
bisulfite two different strands are generated, and each of the strands is represented twice, once
in a version methylated prior to treatment (SEQ ID 2 and 3) and once in a form unmethylated
prior to treatment (SEQ ID 4 and 5). The treated sequences are characterised as containing no
cytosine bases (except for those 5' adjacent to a guanine and methylated before treatment).
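
The derivation of four treated sequences from one genomic sequence can be sketched in Python as follows. This is an in-silico illustration under stated assumptions: the function names and the short input sequence are hypothetical, the "methylated" version assumes every CpG cytosine is methylated, and no claim is made as to which output corresponds to which of SEQ ID 2 to 5.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def bisulfite_convert(strand: str, methylated: bool) -> str:
    """Simulate bisulfite treatment of one strand (read 5' to 3').

    Unmethylated cytosine is read as thymine after amplification.
    If methylated is True, cytosines directly 5' of a guanine are
    treated as methylated and therefore remain cytosine.
    """
    strand = strand.upper()
    out = []
    for i, base in enumerate(strand):
        if base == "C":
            in_cpg = i + 1 < len(strand) and strand[i + 1] == "G"
            out.append("C" if methylated and in_cpg else "T")
        else:
            out.append(base)
    return "".join(out)


def treated_versions(genomic: str) -> dict:
    """Return the four treated sequences: two strands, each in a
    fully methylated and a fully unmethylated version."""
    sense = genomic.upper()
    antisense = reverse_complement(sense)
    return {
        "sense_methylated": bisulfite_convert(sense, True),
        "antisense_methylated": bisulfite_convert(antisense, True),
        "sense_unmethylated": bisulfite_convert(sense, False),
        "antisense_unmethylated": bisulfite_convert(antisense, False),
    }


# Hypothetical toy sequence, not taken from SEQ ID 1.
for name, seq in treated_versions("ACGTCCGA").items():
    print(name, seq)
```

Because the two strands are no longer complementary after conversion, each strand has to be listed separately, which is why four treated sequences correspond to one genomic sequence.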
The following primers are used to generate an amplificate within the PITX2 sequence com- 
prising the CpG sites of interest: 

Primers for the PITX2 bisulfite amplificate (length: 144 bp):
PITX2R02: GTAGGGGAGGGAAGTAGATGTT (SEQ ID NO: 6) 
PITX2Q02: TTCTAATCCTCCTTTCCACAATAA (SEQ ID NO: 7) 

The genomic region corresponding to the generated amplificate of 144 bp in length is given in
SEQ ID NO 18.

Probes: 

PITX2cgl: FAM-AGTCGGAGTCGGGAGAGCGA-Darquencher (SEQ ID NO: 8) 
As an alternative quencher TAMRA was also used in additional experiments: 
FAM-AGTCGGAGTCGGGAGAGCGA-TAMRA 

PITX2tgl: YAKIMA YELLOW-AGTTGGAGTTGGGAGAGTGAAAGGAGA-Darquencher (SEQ ID NO: 9)
In additional experiments we also used:
VIC-AGTTGGAGTTGGGAGAGTGAAAGGAGA-TAMRA

The extent of methylation at a specific locus was determined by the following formula:

methylation rate = 100 * I(CG) / (I(CG) + I(TG))

(I = Intensity of the fluorescence of CG-probe or TG-probe)
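
As a worked illustration, the formula can be written as a small Python function. The function name `methylation_rate` and the intensity values are hypothetical; only the formula itself is taken from the text.

```python
def methylation_rate(i_cg: float, i_tg: float) -> float:
    """Methylation rate in percent from the fluorescence intensities
    of the CG-probe (i_cg) and the TG-probe (i_tg)."""
    total = i_cg + i_tg
    if total <= 0:
        raise ValueError("total fluorescence intensity must be positive")
    return 100.0 * i_cg / total


# Illustrative intensities only: the CG-probe contributes 30 of 100 units.
print(methylation_rate(30, 70))
```

A sample in which both probes give the same intensity would accordingly be scored as 50% methylated.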

PCR components were ordered from Eurogentec:
3 mM MgCl2 buffer, 10x buffer, Hotstart TAQ

Program (45 cycles): 95 °C, 10 min; 95 °C, 15 sec; 62 °C, 1 min

This assay was performed on 236 samples identical to those used in Example 2. The result is
shown in figure 2. Figure 2 shows the Kaplan-Meier estimated disease-free survival curves
for 3 CpG positions of the PITX2 gene by means of Real-Time methylation specific probe
analysis, as described above. The lower curve shows the proportion of disease free patients in
the population with above median methylation levels, the upper curve shows the proportion of
disease free patients in the population with below median methylation levels. The X axis
shows the disease free survival times of the patients in months, and the Y axis shows the
proportion of disease free survival patients. The p-value (probability that the observed
distribution occurred by chance) was calculated as 0.0031, thereby confirming the data obtained by
means of array analysis.

For comparison, figure 3 illustrates the result from the array analysis of said gene, according 
to the chip hybridisation experiment described in Example 2, wherein detection oligos were 
used (for details see EP 03 090 432.0, which is incorporated by reference). The p-value
(probability that the observed distribution occurred by chance) was calculated as 0.0011.

EXAMPLE 5:

We developed another QM assay, which also performed very well. The following
PITX2 specific oligonucleotides were employed to generate an amplificate of 164 bp.
The oligonucleotides are specific for three co-methylated CpG positions.
Primers for PITX2 bisulfite amplificate with a length of 162 bp:
PITX202: AACATCTACTTCCCTCCCCTAC (SEQ ID NO: 14)
PITX2P3: GTTAGTAGAGATTTTATTAAATTTTATTGTAT (SEQ ID NO: 15)
The genomic region corresponding to the generated amplificate of 162 bp in length is given in
SEQ ID NO 19.

Probes (from ABI): 

PITX2-IIcgl: FAM-TTCGGTTGCGCGGT-MGBNQF (SEQ ID NO: 16) 
PITX2-IItgl: VIC-TTTGGTTGTGTGGTTG-MGBNQF (SEQ ID NO: 17)



The extent of methylation at a specific locus was determined by the following formula: 

methylation rate = 100 * I(CG) / (I(CG) + I(TG))

(I = Intensity of the fluorescence of CG-probe or TG-probe) 

PCR components were ordered from Eurogentec: 2.5 mM MgCl2 buffer, 10x buffer, Hotstart
TAQ



Program (45 cycles): 95 °C, 10 min; 95 °C, 15 sec; 60 °C, 1 min 

CLAIMS

1. A method for characterising a cell proliferative disorder of the breast tissues and/or
predicting the survival of a patient diagnosed with said disorder, comprising the steps of:

(a) detecting the expression of a nucleic acid or a polypeptide expressed from the PITX2 
gene in an isolated biological sample representative of said cell proliferative disorders 
of the breast tissues and 

(b) therefrom predicting the survival of said patient, characteristics of said cell prolifera- 
tive disorder, and/or prognosis of said patient. 

2. The method according to claim 1 further comprising 

(c) determining a suitable treatment regimen for the subject. 

3. The method of claim 1, wherein said patient is characterised by being subject to adjuvant
endocrine therapy comprising one or more treatments which target the estrogen receptor 
pathway or are involved in estrogen metabolism, production or secretion. 


4. The method of claim 1, wherein said breast cell proliferative disorders are taken from the
group comprising ductal carcinoma in situ, invasive ductal carcinoma, invasive lobular
carcinoma, lobular carcinoma in situ, comedocarcinoma, inflammatory carcinoma, mucinous
carcinoma, scirrhous carcinoma, colloid carcinoma, tubular carcinoma, medullary carcinoma,
metaplastic carcinoma, and papillary carcinoma and papillary carcinoma in situ, undifferentiated
or anaplastic carcinoma and Paget's disease of the breast.

5. The method according to claim 1 characterised in that the detection is carried out by 

a) contacting said biological sample with an antibody immunoreactive with the PITX2
polypeptide to form an immunocomplex;

b) detecting said immunocomplex; and

c) predicting therefrom the survival of said patient, characteristics of said cell prolifera- 
tive disorder, and/or prognosis of said patient. 



6. The method according to claim 1 characterised in that the detection is carried out by 

a) contacting said biological sample with an antibody immunoreactive with the PITX2
polypeptide to form an immunocomplex;

b) detecting said immunocomplex;

c) therefrom predicting the survival of said patient, characteristics of said cell proliferative
disorder, and/or prognosis of said patient, and

d) comparing the quantity of said immunocomplex to the quantity of immunocomplex
formed under identical conditions with the same antibody and a control sample from one 
or more patients with a known prognosis. 

7. The method according to claim 1 characterised in that the detection is carried out by 

a) contacting said biological sample with an antibody immunoreactive with the PITX2
polypeptide to form an immunocomplex;

b) detecting said immunocomplex; and

c) therefrom predicting the survival of said patient, characteristics of said cell prolifera- 
tive disorder, and/or prognosis of said patient, 

and wherein an increase in quantity of said immunocomplex in the sample from said
subject relative to said control sample is indicative of a bad prognosis. 

8. The method of claim 5, wherein said immunoassay is a radioimmunoassay or an ELISA or
a Western blot.

9. The method of claim 1, wherein said detection is afforded by mRNA expression analysis. 

10. The method of claim 9, comprising detecting the level of mRNA encoding a PITX2
polypeptide in a biological sample from a patient.

11. A method according to claim 10, wherein an increased concentration of said mRNA above
the concentration determined for an individual known to have a good prognosis indicates a 
bad prognosis. 

12. The method of claim 10, comprising the steps of: 

(a) providing a polynucleotide probe which specifically hybridises or is identical to a polynu- 
cleotide consisting of SEQ ID NO: 19 or SEQ ID NO: 1, 
(b) incubating said sample with said polynucleotide probe under high stringency conditions to
form a specific hybridisation complex between a nucleic acid and said probe; 

(c) detecting said hybridisation complex. 

13. The method according to claim 12 wherein said nucleic acid is mRNA or a cDNA derived 
therefrom. 



14. The method according to claim 13 wherein the detecting step further comprises the steps
of:

a) producing a cDNA from mRNA in the sample; 

b) providing two oligonucleotides which specifically hybridise to regions flanking a segment
of the cDNA;

c) performing a polymerase chain reaction on the cDNA of step a) using the oligonucleotides
of step b) as primers to amplify the cDNA segment; and 

d) detecting the amplified cDNA segment.

15. Use of a polypeptide expressed from the PITX2 gene for differentiating or distinguishing 
between patients diagnosed with breast cancer, who have a good survival prognosis and pa- 
tients who have a bad survival prognosis. 

16. Use of a polypeptide expressed from the PITX2 gene for prediction of survival of a pa- 
tient diagnosed with a cell proliferative disorder of the breast. 

17. The method of claim 1 wherein said detection comprises determining the genetic parame- 
ters of the gene PITX2, its promoter and/or regulatory elements. 

18. The method of claim 1 wherein said detection comprises determining the epigenetic
parameters of the gene PITX2, its promoter and/or regulatory elements.

19. The method of claim 1, wherein said detection comprises determining the methylation 
status of one or more CpG positions of a target nucleic acid within the gene PITX2, its pro- 
moter and/or regulatory elements, in particular through the methylation analysis of a genomic 
DNA sequence according to SEQ ID NO: 1. 

20. The method of claim 1, wherein said detection comprises determining the methylation
status of one or more CpG positions of a target nucleic acid characterised as being identical to 
or hybridising under stringent or moderately stringent conditions to a sequence out of the 
group of SEQ ID NOs 13, 18 and 19. 

21. The method of claim 19, wherein the methylation analysis is afforded by contacting said 
target nucleic acid with one or more agents that convert cytosine bases that are unmethylated 
at the 5'-position thereof to a base that is detectably dissimilar to cytosine in terms of hybridi- 
sation properties. 

22. The method of claim 21, wherein contacting said target nucleic acids with one or more 
agents comprises use of a solution selected from the group consisting of bisulfite, hydrogen 
sulfite, disulfite, and combinations thereof. 

23. Use of a set of oligomer probes comprising at least two oligomers, in particular an oligo- 
nucleotide or peptide nucleic acid (PNA)-oligomer, said oligomer comprising in each case at 
least one base sequence having a length of at least 9 nucleotides which is complementary to, 
or hybridises under moderately stringent or stringent conditions to a pretreated genomic DNA 
according to one of the SEQ ID NO: 2 to SEQ ID NO: 5 and sequences complementary 
thereto, for detecting the cytosine methylation state and/or single nucleotide polymorphisms 
(SNPs) within one of the sequences according to SEQ ID NO: 1, and sequences complemen- 
tary thereto. 

24. A method for manufacturing an arrangement of different oligomers (array) fixed to a car- 
rier material for analysing diseases associated with the methylation state of the CpG dinu- 
cleotides of one of SEQ ID NO: 1, and sequences complementary thereto wherein at least one 
oligomer, in particular an oligonucleotide or peptide nucleic acid (PNA)-oligomer, said oli- 
gomer comprising in each case at least one base sequence having a length of at least 9 nu- 
cleotides which is complementary to, or hybridises under moderately stringent or stringent 
conditions to a pretreated genomic DNA according to one of the SEQ ID NO: 2 to SEQ ID 
NO: 5 and sequences complementary thereto, is coupled to a solid phase. 

25. A composition of matter comprising the following:

a nucleic acid comprising a sequence at least 18 bases in length of a segment of the
chemically pretreated genomic DNA according to one of the sequences taken from the
group comprising SEQ ID NO: 1 to SEQ ID NO: 5 and sequences complementary 
thereto, and 

a buffer comprising at least one of the following substances: 1 to 5 mM Magnesium
Chloride, 100-500 nM dNTP, 0.5-5 units/10 ul of taq polymerase, an oligomer, in particular
an oligonucleotide or peptide nucleic acid (PNA)-oligomer, said oligomer
comprising in each case at least one base sequence having a length of at least 9 nucleotides
which is complementary to, or hybridises under moderately stringent or
stringent conditions to a pretreated genomic DNA according to one of the SEQ ID 
NO: 2 to SEQ ID NO: 5 and sequences complementary thereto. 



26. Use of the gene PITX2, its promoter and/or regulatory elements for predicting the survival 
of patients diagnosed with a cell proliferative disease. 

27. Use of the mRNA of the gene PITX2 for predicting the survival of patients diagnosed
with a cell proliferative disease. 

28. A method for predicting the survival of patients diagnosed with a cell proliferative disease 
according to claim 19, comprising: 

a) isolating or enriching genomic DNA from said biological sample;

b) treating the genomic DNA, or a fragment thereof, with one or more reagents to 
convert 5-position unmethylated cytosine bases to uracil or to another base that is de- 
tectably dissimilar to cytosine in terms of hybridisation properties; 

c) contacting the treated genomic DNA, or the treated fragment thereof, with an amplification
enzyme and at least two primers comprising, in each case, a contiguous sequence
at least 18 nucleotides in length that is complementary to, or hybridises under
moderately stringent or stringent conditions to a sequence selected from the group
consisting of SEQ ID NOS: 2 to 5, and complements thereof, wherein the treated
DNA or a fragment thereof is either amplified to produce one or more amplificates, or 
is not amplified; and 

d) determining, based on the presence or absence of, or on the quantity or on a prop- 
erty of said amplificate, the methylation state of at least one CpG dinucleotide se- 
quence of SEQ ID NO: 1, or an average, or a value reflecting an average methylation 
state of a plurality of CpG dinucleotide sequences of SEQ ID NO: 1. 

29. A method for detecting the survival of patients diagnosed with a cell proliferative disease
of the breast according to claim 19, comprising the following steps of 

a) obtaining, from a subject, a biological sample having subject genomic DNA; 

b) treating the genomic DNA, or a fragment thereof, with one or more reagents to 
convert 5-position unmethylated cytosine bases to uracil or to another base that is
detectably dissimilar to cytosine in terms of hybridisation properties;

c) amplifying one or more fragments of the treated DNA such that only DNA originating
from breast or breast cell proliferative disorder cells are amplified;

d) detecting the amplificates or characteristics thereof and thereby deducing on the 
survival of patients diagnosed with a cell proliferative disease of the breast. 

30. Use of an oligomer, an oligonucleotide or peptide nucleic acid (PNA)-oligomer, said oli- 
gomer comprising in each case at least one base sequence having a length of at least 9 nu- 
cleotides which is complementary to, or hybridises under moderately stringent or stringent 
conditions to an artificially modified, chemically pretreated DNA according to one of the
SEQ ID NO: 2 to SEQ ID NO: 5 and sequences complementary thereto, for differentiating or
distinguishing between patients diagnosed with breast cancer, who have a good survival 
prognosis and patients who have a bad survival prognosis. 

31. Use of a nucleic acid comprising a sequence of at least 18 bases in length of a segment of 
the artificially modified, chemically pretreated, DNA according to one of the sequences taken 
from the group comprising SEQ ID NO: 2 to SEQ ID NO: 5 and sequences complementary 
thereto, for differentiating or distinguishing between patients diagnosed with breast cancer, 
who have a good survival prognosis and patients who have a bad survival prognosis. 

32. Use of an oligomer, an oligonucleotide or peptide nucleic acid (PNA)-oligomer, said
oligomer comprising in each case at least one base sequence having a length of at least 9
nucleotides which is complementary to, or hybridises under moderately stringent or stringent
conditions to an artificially modified, chemically pretreated DNA according to one of the
SEQ ID NO: 2 to SEQ ID NO: 5 and sequences complementary thereto, for prediction of
survival of a patient diagnosed with a cell proliferative disorder of the breast.

33. Use of a nucleic acid comprising a sequence of at least 18 bases in length of a segment of 
the artificially modified, chemically pretreated, DNA according to one of the sequences taken 
from the group comprising SEQ ID NO: 2 to SEQ ID NO: 5 and sequences complementary 
thereto, for prediction of survival of a patient diagnosed with a cell proliferative disorder of 
the breast. 

34. Use of a nucleic acid represented by their SEQ ID NO out of the group of nucleic acids 
according to SEQ ID NOS : 6, 7, 8, 9, 10, 14, 15, 22, 23, 24, 25, 26, 27, 28, 29, 30 and 31, for 
differentiating or distinguishing between patients diagnosed with breast cancer, who have a 
good survival prognosis and patients who have a bad survival prognosis. 

35. Use of a nucleic acid represented by their SEQ ID NO out of the group of nucleic acids 
according to SEQ ID NOS : 6, 7, 8, 9, 10, 14, 15, 22, 23, 24, 25, 26, 27, 28, 29, 30 and 31, for 
prediction of survival of a patient diagnosed with a cell proliferative disorder of the breast. 

Abstract 

The present invention relates to methods for predicting the survival of a human being 
diagnosed with a cell proliferative disorder of the breast tissues, characterised by a step of 
determining the expression level of PITX2 or the genetic or the epigenetic modifications of the 
genomic DNA associated with the gene PITX2. The invention also relates to sequences, 
oligonucleotides and antibodies which can be used within the described methods. 

Sequence listing 



<110> Epigenomics AG 
<120> PITX2 - a marker to predict survival of patients diagnosed with 
breast cell proliferative disease 

<160> 12 

<210> 1 

<211> 9001 

<212> DNA 

<213> Homo Sapiens 

<400> 1 



agctgatgga cttgctaaat ttctttcttc ttttttcttt ttcatattat ttgctagcca 
taatggaatc ctctaggttt aagccaaaga aaaattggag agacaaaatt agattttgta 
gcccttttcc cccccgggaa tgcctttttt tttcttttta gtttctgatg aatggctatc 
atttatttct accaaattta aataaggact gctgccttgt atgtttaact aggcaggcag 
agggaactgg tttgtttagg aagcagtgac tgagatgtcc tggccaagtt agtgacagag 
gaggggagaa agaatccaga ccaatttgta tgcagtatat tttactccca tgaaataaaa 
cacatttgtt tcatatttgc tgaaaagtaa aacaataata ttgtacgaaa tgttatacac 
agggtaggtt gtacatagca gtttcagaaa catcattgca tccaccagag aaactattct 
aaaactgata ttcacacatt ttttataata ataataatat gttagaaaca tacagtgtgg 
catttagtat atacactccc ttgctcgcaa gcgaaaaatc ctaatcgctt ctgtataaca 
tgctttattt taaagcctaa cctttaaaaa cactgttgtg atattactaa caactgcttt 
tataaaatta atttgacatt tcgatatata tacatccttt cagtcattta aatgttaaca 
atgctaaact taaaaaataa caagcttata gtaatgttaa aatgtcatat ccagtcaaac 
atttgtttgt gtatgtgtcc ttgcaactgt tagaaatact tgtagtgaaa gatgtcagac 
actgaggaca tccctttgaa atcaaaggag ctctctcttt gattcagtgg tttccttttc 
tctatatagc ttctctttct ctccctttct ttagtgccca cgaccttcta gcataattcc 
cagtctttca agggcggagt tgccccatcc ggcaaggtcc taggatcccg gcgctgtggg 
tgcggctcac acgggccggt ccactgcata ctggcaagca ctcaggttgg aggccgggtt 
ctgcacgctg gcgtagccga agctggagtg ctgctttgct ttcagtctca ggctggccag 
gctcgagtta cacgtgtccc tataaacata cggaggagtc ggcggcgcgt aaggacaggc 
aggcgtcggc accgcggaat tcagcgacgg gctactcagg ttgttcaagt tattcaggct 
gttgagactg gagcccggga cgcctgtcac tgctgagggc accatgctgg acgacatgct 
catggacgag atagagttgg gtggggaaaa catgctctgt gatgacaggg ggttgacgtt 
catagagttg aagaagggga agctcttggt ggatagggag gcggatgtaa ggcccttggc 
ggcccagttg ttgtaggaat agcctgggta catgtcgtcg tagggctgca tgagcccatt 
gaactgcggc ccgaagccat tcttgcatag ctcggcctgc tggttgcgct ccctctttct 
ccatttggcc cgacgattct tgaaccaaac ctgggggcgg ttggggcaag ggagcaaaca 
gatgccacag tgcagattac taaaacttcc atcggaggcc aacccccgcc ttcccccgac 
acacacgcta gcgcactcac acaccctggc ctcgcttcac tgcaccgccc tgcacaccaa 
gataccaggg ccagctttca gttactggcc cgggtctcca ccaagcgcag gagacctggt 
ctgctctggc ctgcgagctg ggactcggag ctacgccaca aacctcagcc gaacgcatgg 
agacctgcgg acggtttgat cactcagcca ggcgtttctc caggtccaaa aacacttaat 
gtaaaacaaa cgcggggcag caggcttttc caacccttcc cggggcacct tgcaaacttg 
cttccattcc aaagccacag acccacggat gaggagaagg ggctggaagg gcactagagg 
atcgctcttt ctcccacgca attcctccct tccttccctg acctccactg tcgtccccca 
ccccctggta cgtgctccct taacagggac taggccgcca acactctttc tcgcctagca 
aaacaaccaa ataaagagca aaagaccacc tcttcgtcag ctcgttaact ccaggagctt 
ggcatattaa actccgggaa cccggaaagg gtagttttgg agattccccc ttctttcgct 
ctgcctcttc tttaccctaa gcccaccaca ggcctgtccg cgcgccaggc ccagccgggt 
cgtttggctt tgcaggcggc cacccaggcc ggccggcttc cacccgtgtc cggtggccca 
gccgcaaccc cgatcccaat ccacatcggg cctccctgtc gccccagacg gcggcttttg 
tgtattggag agaggcctgg cctgagatat ccgagctgac accagtgatg tttcacatta 
cacatctccg ccgggcccag ccgtgtaatc cgctttttct ctttttcctt tcattcttga 
tttccttttt atcccccttc ctctttgcac ccgactgcta taaaaagcac gcctcactcc 
cacttggctc gacaagcagc cgccctggaa ggagaggcag ctgcaaggag agcccagcgc 
cgcggctaca aagcactagg gtggagctgc ggaatagcgg gcggggtggg agggcgtttt 
cgaaggatcc cagaaaaccc atagactctg tctttaatta cttgccattt ctaccctagg 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
ccatctaaac tttgctcagg 
gagagctaat gaaagactga 
ctggacagtt aaactaaaac 
cctcatgcct tgcacaaatg 
cattttggtc tttattcttt 
gggggcttta ttttttctac 
ggatgccgct tgatttgctt 
tcttccaccg accagcataa 
ttgccaatcc ataaagtaca 
atgattataa tttagaagag 
aagataaata aaccaagcag 
ctagtagttg tgtattaaaa 
tgggagagac ctgaatggcc 
agggaaaaag ctgggccagg 
gtctctggcg ggttttcctt 
cctggtaggg attttattag 
tgcaggggat tgcccatgca 
ggtgccgctc tgggagcctg 
taatctcgca accaggccgc 
gccctcccag gcggcgctgc 
caacatcctc cccccatccc 
cgtaaggttg gtccacacag 
ctggaaagtg gcctccagct 
ttgccgcttc ttcttagacg 
tttatctttc tctgaaaacg 
cggacgccaa ctggaccggc 
gagggtgcgc gcggcggctc 
caagtgtggg tgtccgcgcc 
atgtttatct cactgcagcg 
tacacctctg cgcagacaca 
cccccctccc cctcgcagaa 
tgtggggatt cggttgggca 
gcctcttcgc gaagggcatt 
tcgggagctg aaagccgaga 
gaatcaagat gctgggattt 
gagagggcgg cagaattgct 
aactaaaggg atgcggggta 
cgaacgacca ctcccaccac 
agcgggtggc gcagagcagc 
cggcctccgg gctggaggtg 
ccagcgaccg gggctgaccg 
gacaggaaga tgaggagacg 
gcgggccttt catgcagttc 
tgccccgccg agcccccgta 
tttccccgct gcggggagag 
cccccgtctc ctccggaagg 
acgaagacat gggatgtggg 
ccaagctcca aagcgaaaca 
caagaacccc tccaataagg 
gcggcagccc tgacagagaa 
cctgcggcgg cagcagccgc 
gcgcacacgc acacccctcg 
caggcaccca ggcgagcgac 
atcttaaaac cagaggcggg 
agccctctcc gcctccgcct 
caccaatcag gacgccccga 
tgcggggccg ggctactgcc 
attcagctcc tgcccagccc 
atgaggctag gagtcgaagg 
caaggatcaa caagacaccc 
— e-oa-ta feaa^ a~a.ggag a cacg 
cctctccgat cttaaatttt 
agctttaatt gcaaagaaga 



cgagaagagt 
ccttgctcaa 
cattttcaac 
ccacccagag 
ttatcgttgt 
ccagagcact 
gattctgttt 
accaggacgt 
gatttgctac 
ggggtgtgag 
aaaagtcttt 
ctttgctccc 
gaaacaaccg 
gccgggacaa 
gttaaaggct 
ctctgctctg 
gcccagctcg 
agccagggcg 
cgcgaggcct 
cttccacatt 
tcccagactc 
cgatttcttc 
cctggagctg 
ggtcctcggc 
aaacacacac 
ggcagaagcc 
cgggccgcga 
ccatttcctc 
gcacattcac 
ccaaatctcc 
agctcagatt 
ccgaagttcg 
tctgagtggt 
ggaaaacagg 
ttgtgaccca 
cgcgccctta 
gtcaaaattc 
gcctcccccc 
tgagcgggaa 
tcggagatgg 
ggagccagaa 
gccgacagct 
atggacgagg 
gccgccgctg 
ccaggggacg 
ctcaagcgaa 
cagaagggca 
agagtgggca 
aaagctaacg 
gtgtcaagag 
tgcagccacg 
ggcggtcgaa 
ggaccagatc 
cttcctggtg 
cctcccagac 
gccgcggtgg 
tcgccgtgcg 
aaggcgatcc 
cttgggagaa 
actctttgtg 
-tliactiiiaaaa 
ccaaacagcc 
cagagccctg 



acgtgagagg 
aaccacgccg 
ttcttcccgg 
agtgtcttca 
tttcttcttt 
taattttttt 
tctgcttcca 
tgctattggg 
aaagttaagg 
tttcaatttc 
cttctttttt 
ggagatcaca 
taaagaaggt 
aggtttccca 
cacaggttgg 
gcaactgcaa 
tgagatcgcg 
gcagtcctgt 
tctgcctttg 
ctctcctggt 
cgtgctggct 
gcgtgtggac 
ctggctggta 
gcccacgtcc 
actttcccgt 
gtggaagagc 
ggagcgctgc 
ccctccccca 
ttttatagcc 
tgggacgcgc 
tccatgcggt 
ccggcccttt 
ttcaggcaat 
gacagaggtc 
ggaaacagaa 
gcgccccagg 
cggctcccgg 
ggaggggctg 
tgtctgcagg 
tgtgcacctc 
ccgaagccat 
tggtccccgc 
gagcgcgacg 
cccgctccgg 
caacccccgc 
aaagtccgga 
ccactcagag 
aagaccccct 
ccgaccgcgc 
tgacagggac 
acgcggccct 
caggagccgg 
tgcggctccg 
ccgagacgtc 
ccttctccgg 
agggactgtc 
cactgggtct 
ggcttttagt 
gagagtggaa 
tctcactaca 
ctagaaaatt 
tgtcaagtga 
aaaaggcagg 



cccgttccct 

cccaggaccc 
ccttttatcc 
ttccctctga 
ttgtttgctc 
tttttaacag 
gaatcctaac 
ttatttattt 
taagcccttt 
cagagttcaa 
tctttctcct 
aaactaggaa 
gtaagaagcg 
gggagggcca 
agcctgttcg 
gccaggaaca 
ggatggcggg 
cggcctcgga 
caaagctgcg 
ctacttggcc 
cctacccgga 
atgtccgggt 
aagtgagtcc 
tcattcttcc 
cagcatgccc 
tgggctgcct 
gcctgtgggg 
gcgccgcacg 
tgtgctttca 
acacgcgcgt 
ttgggaaggc 
cccaaaaaaa 
ttcctaacga 
ggcggcctct 
gggaggccag 
agccgggccg 
aagttctgcg 
acttccttgg 
gcggcgcggc 
cagcctgtgc 
ggctaacggc 
tgctcggtgc 
ctctactagt 
gtcgcgctct 
cgagttctca 
gacggaaagt 
cgtctttagg 
tcttctctcc 
tctgcccgcc 
aggtaggtga 
ctgagcgcac 
gccttgccgc 
cgcttccctg 
actccgccgc 
gtgcgactga 
ctgcctgcac 
acacaggcaa 
acgaacccaa 
tggtcaagaa 
tccatttcca 
tgaaaaacag 
atgctgcgct 
ctaataaatt 



tgatgtgcaa 
agctctggct 
accagcatag 
tttgggagag 
tgctctaacc 
caaagcctct 
aaatttggaa 
gagctcattt 
ttacaaaact 
ctcctgagag 
tctaagagga 
atagggtgtg 
cgagcccagg 
actcttccgt 
cggctcttgg 
caatgtcctg 
gcagtgagcc 
gagggaactg 
ccccaccggc 
tgtacctcca 
ctcgggcttc 
agcggttcct 
gctgccgcct 
cctgctggct 
acctgcaacg 
ggcgccggag 
tgtgcaggcg 
ttttatttac 
agtatattta 
ggtttacaga 
taggaaaaga 
aaaaaaaaali 
gtggagctcc 
gaaggtcctc 
ggtacgaata 
gtcgagggag 
gggagccagg 
ggcgagaggg 
gccttacctg 
ttggaggagt 
tggggatggt 
tccaagtgaa 
ccttggctac 
aggcgcggag 
agccaagctg 
cagcgggcaa 
gagcaggctt 
ctccctcccc 
ccccccccac 
tattagatcc 
cctccgcaac 
agctcagctc 
ttggcctaac 
ggccctcccc 
cgtggctccg 
ctatcagcag 
gctcccggga 
aggtgaagag 
gagaaaggta 
atcccccacc 
caacaaatca 
aatctgaaga 
agaaatcgag 



2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 



aagcaaatgg 
ttcactctgg 
attagtcccc 
tgctgaaagc 
attccttgta 
tctcttgttt 
cttgagctaa 
caaataacca 
gctgtcttga 
aagcctagaa 
tcacctttgg 
aaattaaaga 
agcataggtc 
tttactatgg 
gatttgcaat 
attgggaaac 
acttatgtgt 
tttaaacttg 
tcatcccact 
atcagcattt 
ctgctcagag 
cagctatatt 
tggtaaagaa 
tttatcccgg 
caatgctctt 
ttaaataagg 
aagatcataa 
cctataacta 
tccagtttct 
agcttcattt 
tagtatacac 
ttttggaagc 
atagatttct 
ccactctacc 
cctccaaagt 
cgagatgttg 
ctaatttgtg 
agattaaagt 
ttacagacac 
ttttctttgc 



acccgtcaaa 
atttatacaa 
agacagaaaa 
ttatccattc 
cactgtatta 
actttattat 
ttatatatga 
atttcaagga 
gccattaaag 
ctgcgctcaa 
agacatcaac 
aagaaatcca 
agctggagag 
agttagtgtt 
gccagcatga 
ttattcgatg 
ctctcctttg 
cagtccaaag 
ttaagaataa 
ctgcttttaa 
taaaacacat 
aggtcctgca 
ggttaaatta 
agatttggcg 
acatttgtag 
gttttaaatg 
agtatatgtg 
agtcaataag 
tcttttaaac 
catgaaggga 
agcccctcac 
ttgggtgtta 
aaacccatct 
gatggaactt 
ggttgaaaaa 
gaagcactgg 
tggacccatt 
aaaagcaaaa 
acacacgcac 
atttttccag 



agaaaattac 
gaataaaaag 
cacacaatag 
tacttaacgt 
agctcgtcct 
caatcagatt 
aatatgcctt 
taatttttaa 
tccaagcagg 
ctagcaaaag 
tctttatagc 
aacatattca 
gacaaactaa 
atcatctctg 
agtatcttta 
tggaacaaag 
ccctgaccac 
acgcacatga 
tttagctgca 
ccttttattc 
cctcatgtga 
atcttatcac 
atttacattc 
gagaatctcc 
tggtttttaa 
tttctagccg 
taaagtaaat 
aatccagctc 
ctcagaatag 
ctccatcaca 
tccttgtttt 
atgccttatt 
ctcataaacc 
tcatcacgac 
cccaagggca 
ggatcagcag 
gaagtcaagt 
ccatatctat 
acacacactg 
caggagtttc 
cttgacttta 
tcgcctcaga 
aagagaaacc 
tgattaagac 
aacccgagag 
taaatccata 
aatgaatttc 
cagtcatttt 
cagaaggggt 
caaaacctta 
actgtttcca 
aaataatttt 
tctcctctgg 
aatgtgtatt 
aaacactccc 
tggatgaagc 
ccccaaaccc 
gaattgtttt 
agggaggaat 
cactttaccc 
caggtctgca 
taaattatac 
tgctcattat 
ttctcagacc 
tctgataaga 
ttttcttatt 
atttcctccc 
ttttctgctg 
ctgtggtccc 
ttaaagaatg 
tcaagattca 
ttagaaagcc 
cacagaattt 
aaatatacat 
cgtgactgct 
cagcctagat 
ggtgaataaa 
ttgtatatat 
gctctgtaaa 
aacattctcc 



aacgaacaac 
tcacgttctc 
ctaacccagc 
acatatccta 
agccacgctt 
aagcctgtag 
catacaatta 
cttttcccag 
gtgtgtgagc 
tttatataaa 
agcaaattta 
tgaaagtcct 
gtttctgcat 
tgtttgacat 
tccttgtcct 
agactacaaa 
tatctgcaac 
tcagtctttc 
ttcttcatag 
cattccacac 
ttagctgagg 
acattacact 
ctggtgctta 
ccacagcgtt 
ctctaatttg 
gaatttcctc 
attgcactgc 
aatgtgttta 
cacaatacca 
aaaaaaatct 
aaccccagag 
gagaagcccc 
tgataaaagc 
gtatgaagga 
cctcatagtg 
gcctaaaaag 
gacaattatc 
atattcacat 
caactgactc 
taatctccta 



tgtttggtgg 
tgtgatgctt 
gttttcaaaa 
gatctttcaa 
taaattcgac 
aatcaacaac 
agaatgttgc 
tgagctcaag 
taagggcgaa 
acaaaaaaaa 
atttccaaag 
tttgtccccc 

gggcgattgt 

tacagtcaat 
tgttcacaag 
tatatttgca 
tcctccccat 
ttcaccagta 
taagctttaa 
atacagacac 
ctcatacatc 
agcagcctgt 
aatgacgcat 
tcactgaaga 
cttaagtctt 
taattccccc 
cagccgatga 
ctaatcatat 
tgccccttaa 
ccactgtagt 
ctgcaaatat 
acagagccat 
tctggtggct 
cctcaatcag 
ccaacgtgtg 
ataaggtgtc 
tagataattc 
ccattttata 
aaagtgagga 
atcactttac 



6660 

6720 

6780 

6840 

6900 

6960 

7020 

7080 

7140 

7200 

7260 

7320 

7380 

7440 

7500 

7560 

7620 

7680 

7740 

7800 

7860 

7920 

7980 

8040 

8100 

8160 

8220 

8280 

8340 

8400 

8460 

8520 

8580 

8640 

8700 

8760 

8820 

8880 

8940 

9000 

9001 



<210> 2 
<211> 9001 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 2 



agttgatgga 
taatggaatt 
gttttttttt 
atttattttt 
agggaattgg 
gaggggagaa 
tatatttgtt 
agggtaggtt 
aaaattgata 
tatttagtat 
tgttttattt 



tttgttaaat 
ttttaggttt 
ttttcgggaa 
attaaattta 
tttgtttagg 
agaatttaga 
ttatatttgt 
gtatatagta 
tttatatatt 
atatattttt 
taaagtttaa 



tttttttttt 
aagttaaaga 
tgtttttttt 
aataaggatt 
aagtagtgat 
ttaatttgta 
tgaaaagtaa 
gttttagaaa 
ttttataata 
ttgttcgtaa 
tttttaaaaa 



tttttttttt 

aaaattggag 

ttttttttta 

gttgttttgt 

tgagatgttt 

tgtagtatat 

aataataata 

tattattgta 

ataataatat 

gcgaaaaatt 

tattgttgtg 



tttatattat 
agataaaatt 
gtttttgatg 
atgtttaatt 
tggttaagtt 
tttattttta 
ttgtacgaaa 
tttattagag 
gttagaaata 
ttaatcgttt 
atattattaa 



ttgttagtta 

agattttgta 

aatggttatt 

aggtaggtag 

agtgatagag 

tgaaataaaa 

tgttatatat 

aaattatttt 

tatagtgtgg 

ttgtataata 

taattgtttt 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
tataaaatta 
atgttaaatt 
atttgtttgt 
attgaggata 
tttatatagt 
tagtttttta 
tgcggtttat 
ttgtacgttg 
gttcgagtta 
aggcgtcggt 
gttgagattg 
tatggacgag 
tatagagttg 
ggtttagttg 
gaattgcggt 
ttatttggtt 
gatgttatag 
atatacgtta 
gatattaggg 
ttgttttggt 
agatttgcgg 
gtaaaataaa 
tttttatttt 
atcgtttttt 
ttttttggta 
aaataattaa 
ggtatattaa 
ttgttttttt 
cgtttggttt 
gtcgtaattt 
tgtattggag 
tatattttcg 
tttttttttt 
tatttggttc 
cgcggttata 
cgaaggattt 
ttatttaaat 
gagagttaat 
ttggatagtt 
ttttatgttt 
tattttggtt 
gggggtttta 
ggatgtcgtt 
ttttttatcg 
ttgttaattt 
atgattataa 
aagataaata 
ttagtagttg 
tgggagagat 
agggaaaaag 
gtttttggcg 
tttggtaggg 
tgtaggggat 
ggtgtcgttt 
taatttcgta 
gtttttttag 
taatattttt 
cgtaaggttg 
ttggaaagtg 
ttgtcgtttt 
tttatttttt 
cggacgttaa 
gagggtgcgc 



atttgatatt 
taaaaaataa 
gtatgtgttt 
tttttttgaa 
tttttttttt 
agggcggagt 
acgggtcggt 
gcgtagtcga 
tacgtgtttt 
atcgcggaat 
gagttcggga 
atagagttgg 
aagaagggga 
ttgtaggaat 
tcgaagttat 
cgacgatttt 
tgtagattat 
gcgtatttat 
ttagttttta 
ttgcgagttg 
acggtttgat 
cgcggggtag 
aaagttatag 
tttttacgta 
cgtgtttttt 
ataaagagta 
atttcgggaa 
tttattttaa 
tgtaggcggt 
cgattttaat 
agaggtttgg 
tcgggtttag 
attttttttt 
gataagtagt 
aagtattagg 
tagaaaattt 
tttgtttagg 
gaaagattga 
aaattaaaat 



tcgatatata 

taagtttata 
ttgtaattgt 
attaaaggag 
tttttttttt 
tgttttattc 
ttattgtata 
agttggagtg 
tataaatata 



ttagcgacgg 
cgtttgttat 

gtggggaaaa 
agtttttggt 
agtttgggta 
ttttgtatag 
tgaattaaat 
taaaattttt 
atattttggt 
gttattggtt 
ggattcggag 
tatttagtta 
taggtttttt 
atttacggat 
attttttttt 
taatagggat 
aaagattatt 
ttcggaaagg 
gtttattata 
tatttaggtc 
ttatatcggg 
tttgagatat 
tcgtgtaatt 
ttttttgtat 
cgttttggaa 
gtggagttgc 
atagattttg 
cgagaagagt 
ttttgtttaa 
tatttttaat 
tgtataaatg ttatttagag 
tttatttttt ttatcgttgt 
ttttttttat ttagagtatt 
tgatttgttt gattttgttt 
attagtataa attaggacgt 
ataaagtata gatttgttat 
tttagaagag ggggtgtgag 
aattaagtag aaaagttttt 
tgtattaaaa ttttgttttc 
ttgaatggtc gaaataatcg 
ttgggttagg gtcgggataa 
ggtttttttt gttaaaggtt 
attttattag ttttgttttg 
tgtttatgta gtttagttcg 
tgggagtttg agttagggcg 
attaggtcgt cgcgaggttt 
gcggcgttgt tttttatatt 
tttttatttt ttttagattt 
gtttatatag cgattttttc 
gtttttagtt tttggagttg 
tttttagacg ggttttcggc 
tttgaaaacg aaatatatat 
ttggatcggc ggtagaagtc 
gcggcggttt cgggtcgcga 



tatatttttt 
gtaatgttaa 
tagaaatatt 
tttttttttt 
ttagtgttta 
ggtaaggttt 
ttggtaagta 
ttgttttgtt 
cggaggagtc 
gttatttagg 
tgttgagggt 
tatgttttgt 
ggatagggag 
tatgtcgtcg 
ttcggtttgt 
ttgggggcgg 
atcggaggtt 
ttcgttttat 
cgggttttta 
ttacgttata 
ggcgtttttt 
taattttttt 



gaggagaagg 
tttttttttg 
taggtcgtta 
ttttcgttag 
gtagttttgg 
ggtttgttcg 
ggtcggtttt 
tttttttgtc 
tcgagttgat 
cgtttttttt 
tcgattgtta 
ggagaggtag 
ggaatagcgg 
tttttaatta 



tagttattta 
aatgttatat 
tgtagtgaaa 
gatttagtgg 
cgatttttta 
taggatttcg 
tttaggttgg 
tttagtttta 
ggcggcgcgt 
ttgtttaagt 
attatgttgg 
gatgataggg 
gcggatgtaa 
tagggttgta 
tggttgcgtt 
ttggggtaag 
aattttcgtt 
tgtatcgttt 
ttaagcgtag 
aattttagtc 
taggtttaaa 
cggggtattt 
ggttggaagg 
atttttattg 
atattttttt 



ttcgttaatt 
agattttttt 
cgcgttaggt 
tattcgtgtt 
gttttagacg 
attagtgatg 
tttttttttt 



acgtgagagg 
aattacgtcg 
tttttttcgg 
agtgttttta 
tttttttttt 
taattttttt 
tttgttttta 
tgttattggg 
aaagttaagg 
ttttaatttt 
tttttttttt 
ggagattata 
taaagaaggt 
aggtttttta 
tataggttgg 
gtaattgtaa 
tgagatcgcg 
gtagttttgt 
tttgtttttg 
tttttttggt 
cgtgttggtt 
gcgtgtggat 
ttggttggta 
gtttacgttt 
attttttcgt 
gtggaagagt 
ggagcgttgc 



taaaaagtac 
ttgtaaggag 
gcggggtggg 
tttgttattt 
ttcgtttttt 
tttaggattt 
ttttttattt 
ttttttttga 
ttgtttgttt 
tttttaatag 
gaattttaat 
ttatttattt 
taagtttttt 
tagagtttaa 
tttttttttt 
aaattaggaa 
gtaagaagcg 
gggagggtta 
agtttgttcg 
gttaggaata 
ggatggcggg 
cggtttcgga 
taaagttgcg 
ttatttggtt 
tttattcgga 
atgttcgggt 
aagtgagttc 
ttattttttt 



aatgttaata 
ttagttaaat 
gatgttagat 

tttttttttt 
gtataatttt 
gcgttgtggg 
aggtcgggtt 
ggttggttag 
aaggataggt 
tatttaggtt 
acgatatgtt 
ggttgacgtt 
ggtttttggc 
tgagtttatt 
tttttttttt 
ggagtaaata 
ttttttcgat 
tgtatattaa 
gagatttggt 
gaacgtatgg 
aatatttaat 
tgtaaatttg 
gtattagagg 
tcgtttttta 
tcgtttagta 
ttaggagttt 
ttttttcgtt 
ttagtcgggt 
cggtggttta 
gcggtttttg 
ttttatatta 
ttatttttga 
gttttatttt 
agtttagcgt 
agggcgtttt 
ttattttagg 
tgatgtgtaa 
agttttggtt 
attagtatag 
tttgggagag 
tgttttaatc 
taaagttttt 
aaatttggaa 
gagtttattt 
ttataaaatt 



tagtatgttt atttgta 



tttttgagag 
tttaagagga 
atagggtgtg 
cgagtttagg 
attttttcgt 
cggtttttgg 
taatgttttg 
gtagtgagtc 
gagggaattg 
ttttatcggc 
tgtattttta 
ttcgggtttt 
agcggttttt 
gttgtcgttt 
tttgttggtt 
atttgtaacg 



tgggttgttt 
gtttgtgggg 



ggcgtcggag 
tgtgtaggcg 



720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 






taagtgtggg 
atgtttattt 
tatatttttg 
tttttttttt 
tgtggggatt 
gttttttcgc 
tcgggagttg 
gaattaagat 
gagagggcgg 
aattaaaggg 
cgaacgatta 
agcgggtggc 
cggttttcgg 
ttagcgatcg 
gataggaaga 
gcgggttttt 
tgtttcgtcg 
ttttttcgtt 
ttttcgtttt 
acgaagatat 
ttaagtttta 
taagaatttt 
gcggtagttt 
tttgcggcgg 
gcgtatacgt 
taggtattta 
attttaaaat 
agtttttttc 
tattaattag 
tgcggggtcg 
atttagtttt 
atgaggttag 
taaggattaa 
ttatataaaa 
ttttttcgat 
agttttaatt 
aagtaaatgg 
tttattttgg 
attagttttt 
tgttgaaagt 
attttttgta 
ttttttgttt 
tttgagttaa 
taaataatta 
gttgttttga 
aagtttagaa 
ttatttttgg 
aaattaaaga 
agtataggtt 
tttattatgg 
gatttgtaat 
attgggaaat 
atttatgtgt 
tttaaatttg 
ttattttatt 
attagtattt 
ttgtttagag 
tagttatatt 
tggtaaagaa 
tttatttcgg 
taatgttttt 
ttaaataagg 
aagattataa 



tgttcgcgtt 
tattgtagcg 
cgtagatata 
tttcgtagaa 
cggttgggta 
gaagggtatt 
aaagtcgaga 
gttgggattt 
tagaattgtt 
atgcggggta 
tttttattac 
gtagagtagt 
gttggaggtg 
gggttgatcg 
tgaggagacg 
tatgtagttt 
agttttcgta 
gcggggagag 
tttcggaagg 
gggatgtggg 
aagcgaaata 
tttaataagg 
tgatagagaa 
tagtagtcgt 
atatttttcg 
ggcgagcgac 
tagaggcggg 
gttttcgttt 
gacgtttcga 
ggttattgtt 
tgtttagttt 
gagtcgaagg 
taagatattt 
aggagatacg 
tttaaatttt 
gtaaagaaga 
attcgttaaa 
atttatataa 
agatagaaaa 
ttatttattt 
tattgtatta 
attttattat 
ttatatatga 
attttaagga 
gttattaaag 
ttgcgtttaa 
agatattaat 
aagaaattta 
agttggagag 
agttagtgtt 
gttagtatga 
ttattcgatg 
tttttttttg 
tagtttaaag 
ttaagaataa 
ttgtttttaa 
taaaatatat 
aggttttgta 
ggttaaatta 
agatttggcg 
atatttgtag 
gttttaaatg 
agtatatgtg 



ttattttttt 



gtatatttat 
ttaaattttt 
agtttagatt 
tcgaagttcg 
tttgagtggt 
ggaaaatagg 
ttgtgattta 
cgcgttttta 
gttaaaattt 
gttttttttc 
tgagcgggaa 
tcggagatgg 
ggagttagaa 
gtcgatagtt 
atggacgagg 
gtcgtcgttg 
ttaggggacg 
tttaagcgaa 
tagaagggta 
agagtgggta 
aaagttaacg 
gtgttaagag 
tgtagttacg 
ggcggtcgaa 
ggattagatt 
ttttttggtg 
ttttttagat 
gtcgcggtgg 
tcgtcgtgcg 
aaggcgattc 
tttgggagaa 
attttttgtg 
ttatttaaaa 
ttaaatagtt 
tagagttttg 
agaaaattat 
gaataaaaag 
tatataatag 
tatttaacgt 
agttcgtttt 
taattagatt 
aatatgtttt 
taatttttaa 
tttaagtagg 
ttagtaaaag 
tttttatagt 
aatatattta 
gataaattaa 
attatttttg 
agtattttta 
tggaataaag 
ttttgattat 
acgtatatga 
tttagttgta 
ttttttattt 
ttttatgtga 
attttattat 
atttatattt 



gagaattttt 
tggtttttaa 
tttttagtcg 
taaagtaaat 



ttttttttta 
ttttatagtt 
tgggacgcgt 
tttatgcggt 
tcggtttttt 
tttaggtaat 
gatagaggtc 
ggaaatagaa 
gcgttttagg 
cggttttcgg 
ggaggggttg 
tgtttgtagg 
tgtgtatttt 
tcgaagttat 
tggttttcgt 
gagcgcgacg 
ttcgtttcgg 
taattttcgt 
aaagttcgga 
ttatttagag 
aagatttttt 
tcgatcgcgt 
tgatagggat 
acgcggtttt 
taggagtcgg 
tgcggtttcg 
tcgagacgtt 
tttttttcgg 
agggattgtt 
tattgggttt 
ggtttttagt 
gagagtggaa 
ttttattata 
ttagaaaatt 
tgttaagtga 
aaaaggtagg 
tttgatttta 
tcgttttaga 
aagagaaatt 
tgattaagat 
aattcgagag 
taaatttata 
aatgaatttt 
tagttatttt 
tagaaggggt 
taaaatttta 
attgttttta 
aaataatttt 
ttttttttgg 
aatgtgtatt 
aaatattttt 
tggatgaagt 
ttttaaattt 
gaattgtttt 
agggaggaat 
tattttattt 
taggtttgta 
taaattatat 
tgtttattat 
tttttagatt 
tttgataaga 
tttttttatt 
attttttttt 



gcgtcgtacg 
tgtgttttta 
atacgcgcgt 
ttgggaaggt 
tttaaaaaaa 
tttttaacga 
ggcggttttt 
gggaggttag 
agtcgggtcg 
aagttttgcg 
atttttttgg 
gcggcgcggc 
tagtttgtgt 
ggttaacggt 
tgttcggtgt 
ttttattagt 
gtcgcgtttt 
cgagttttta 
gacggaaagt 
cgtttttagg 
tttttttttt 
tttgttcgtt 
aggtaggtga 
ttgagcgtat 
gttttgtcgt 
cgtttttttg 
atttcgtcgc 
gtgcgattga 
ttgtttgtat 
atataggtaa 
acgaatttaa 
tggttaagaa 
tttattttta 
tgaaaaatag 
atgttgcgtt 
ttaataaatt 
aacgaataat 
ttacgttttt 
ttaatttagc 
atatatttta 
agttacgttt 
aagtttgtag 
tatataatta 
ttttttttag 
gtgtgtgagt 
tttatataaa 
agtaaattta 
tgaaagtttt 
gtttttgtat 
tgtttgatat 
tttttgtttt 
agattataaa 
tatttgtaat 
ttagtttttt 
ttttttatag 
tattttatat 
ttagttgagg 
atattatatt 
ttggtgttta 
ttatagcgtt 
ttttaatttg 
gaattttttt 
attgtattgt 



ttttatttat 
agtatattta 
ggtttataga 
taggaaaaga 
aaaaaaaaat 
gtggagtttt 
gaaggttttc 
ggtacgaata 
gtcgagggag 
gggagttagg 
ggcgagaggg 
gttttatttg 
ttggaggagt 
tggggatggt 
tttaagtgaa 
ttttggttat 
aggcgcggag 
agttaagttg 
tagcgggtaa 
gagtaggttt 
tttttttttt 
ttttttttac 
tattagattt 
ttttcgtaac 
agtttagttt 
ttggtttaat 
ggtttttttt 
cgtggtttcg 
ttattagtag 
gttttcggga 
aggtgaagag 
gagaaaggta 
attttttatt 
taataaatta 
aatttgaaga 
agaaatcgag 
tgtttggtgg 
tgtgatgttt 
gtttttaaaa 
gattttttaa 
taaattcgat 
aattaataat 
agaatgttgt 
tgagtttaag 
taagggcgaa 
ataaaaaaaa 
atttttaaag 
tttgtttttt 
gggcgattgt 
tatagttaat 
tgtttataag 
tatatttgta 
ttttttttat 
tttattagta 
taagttttaa 
atatagatat 
tttatatatt 
agtagtttgt 
aatgacgtat 
ttattgaaga 
tttaagtttt 
taattttttt 
tagtcgatga 



4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6960 

7020 

7080 

7140 

7200 

7260 

7320 

7380 

7440 

7500 

7560 

7620 

7680 

7740 

7800 

7860 

7920 

7980 

8040 

8100 

8160 

8220 






tttataatta 
tttagttttt 
agttttattt 
tagtatatat 
ttttggaagt 
atagattttt 
ttattttatc 
tttttaaagt 
cgagatgttg 
ttaatttgtg 
agattaaagt 
ttatagatat 
ttttttttgt 



agttaataag 
ttttttaaat 
tatgaaggga 
agttttttat 
ttgggtgtta 
aaatttattt 
gatggaattt 
ggttgaaaaa 
gaagtattgg 
tggatttatt 
aaaagtaaaa 
atatacgtat 
atttttttag 



aatttagttt 

tttagaatag 

ttttattata 

tttttgtttt 

atgttttatt 

tttataaatt 

ttattacgat 

tttaagggta 

ggattagtag 

gaagttaagt 

ttatatttat 

atatatattg 

taggagtttt 



ttttttgttg 
ttgtggtttt 
ttaaagaatg 
ttaagattta 
ttagaaagtc 
tatagaattt 
aaatatatat 
cgtgattgtt 
tagtttagat 
ggtgaataaa 
ttgtatatat 
gttttgtaaa 
aatatttttt 



aatgtgttta 
tataatatta 
aaaaaaattt 
aattttagag 
gagaagtttt 
tgataaaagt 
gtatgaagga 
ttttatagtg 
gtttaaaaag 
gataattatt 
atatttatat 
taattgattt 
taatttttta 



ttaattatat 
tgttttttaa 
ttattgtagt 
ttgtaaatat 
atagagttat 
tttggtggtt 
ttttaattag 
ttaacgtgtg 
ataaggtgtt 
tagataattt 
ttattttata 
aaagtgagga 
attattttat 



8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
9000 
9001 



<210> 3 
<211> 9001 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 



<400> 3 

tgtaaagtga 

atttttattt 

atataaaatg 

tgaattattt 

ggatatttta 

gtatacgttg 

gttgattgag 

gagttattag 

tatggttttg 

aatatttgta 

aattatagtg 

tttaaggggt 

aatatgatta 

gttatcggtt 

tgggggaatt 

aaagatttaa 

gtttttagtg 

aatgcgttat 

aataggttgt 

ggatgtatga 

ggtgtttgta 

tttaaagttt 

atattggtga 

aatggggagg 

ttgtaaatat 

ttttgtgaat 

tattgattgt 

aataatcgtt 

tgggggataa 

tttttggaaa 

attttttttg 

tttcgttttt 

ttttgagttt 

ggtaatattt 

ggttgttgat 



tttgaaagat 
attttgaaaa 



ttaggagatt 
tgagttagtt 
gatgtgaata 
agataattgt 
tttttttagg 
gtattatgag 
gttttttata 
agtttttatt 

tggggttttt 
gttttggggt 
gagatttttt 
atggtattgt 
gtaaatatat 
ggtagtgtaa 
agaggaaatt 
gtaaattaga 
aaacgttgtg 
ttaagtatta 
tagtgtaatg 
gttttagtta 
tgtgtggaat 
attatgaaga 
agaaagattg 
agttgtagat 
atttgtagtt 
aaggataagg 
aatgttaaat 
tatgtagaaa 
aaggattttt 
ttaaatttgt 
ttttatataa 
agtttatata 
attgggaaaa 
ttaattgtat 
tttataggtt 
-aaagGgt^ggt- 
ttaggatatg 
cgttgggtta 



aggagaatgt 
gtttatagag 
tatatatata 
ttttatttat 
tatttaggtt 
gagtagttac 
tatgtatatt 
aaaattttgt 
cggtttttta 
ttgaattttg 
ttatttttta 
ggggattata 
ttagtagaaa 
tgggaggaaa 
taataagaaa 
gttttattag 
gggtttgaga 
gataatgagt 
tgtataattt 
atgtagattt 
ggggtaaagt 
aatttttttt 
aaaaataatt 
agggtttggg 
tgttttattt 
agggagtgtt 
aaatatatat 
tttagaggag 
aaaaattatt 
ttggaaatag 
ataaggtttt 
tatttttttt 
gaaaatgatt 
ggaaatttat 
ttatggattt 
1 1 1 ttcgg gt- 
tgttttaatt 
gggttttttt 



tgaaattttt 

ttagtgtgtg 
aatagatatg 
tatttgattt 
gttgttgatt 
gtgtttttgg 
tgtcgtgatg 
gggtttatga 
aaataaggta 
aaaaataagg 
atgtgatgga 
gttattttga 
agagttggat 
tatttatttt 



acggttagaa 
attaaaaatt 
aggagatttt 
agaatgtaaa 
agtgataaga 
gttatatgag 
ggaataaaag 
ttgtagttaa 
tttatgtgcg 
ggtggttagg 
attttgtttt 
ttaaagatat 
ttagagatga 
attagtttgt 
ttgaatatgt 
tgttataaag 
gtttttgtta 
gtttgtttgg 
gttaaaaatt 
taaggtatat 
aaatttgatt 
taggacgagt 
aacgttaagt 
tttattgtgt 



gttggaaaaa 
tgtgcgtgtg 
gttttgtttt 
taatgggttt 
tttagtgttt 
gttttttaat 
aaagttttat 
gagatgggtt 
ttaatattta 
agtgaggggt 
gtttttttat 
ggtttaaaag 
ttttattgat 
atatatatat 
atatttaaaa 
attataaatg 
tcgttaaatt 
ttaatttaat 
ttgtaggatt 
gatgtgtttt 
gttaaaagta 
attattttta 
tttttggatt 
gtaaaggaga 
atatcgaata 
tttatgttgg 
taatattaat 
tttttttagt 
ttggattttt 
agttgatgtt 
gttgagcgta 
attttaatgg 
atttttgaaa 
tttatatata 
gataataaag 
_t t aat a t a gt 
agaatggata 
gttttttgtt 



tgtaaagaaa 
tgtgtttgta 
tattttaatt 
atataaatta 
ttaatatttc 
tattttggag 
cggtagagtg 
tagaaattta 
agtttttaaa 
tgtgtatatt 
gaaatgaagt 
aagaaattgg 
ttagttatag 
tttatgattt 
tttttattta 
taagagtatt 
ttcgggataa 
tttttttatt 
taatatagtt 
attttgagta 
gaaatgttga 
aagtgggatg 
gtaagtttaa 
gatatataag 
agttttttaa 
tattgtaaat 
tttatagtaa 
tgatttatgt 
ttttttaatt 
tttaaaggtg 
gttttaggtt 
tttaagatag 
ttggttattt 
attagtttaa 
taaataagag 

a 



agtttttagt 
tggggattaa 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 




taagtattat agagaacgtg atttgaggcg attttttatt tttgtataaa tttagagtga 2340 

attattaaat agttgttcgt ttaaagttaa ggtaattttt ttttgacggg tttatttgtt 2400 

tttcgatttt taatttatta gtttgttttt ttagggtttt gttttttttg taattaaagt 2460 

ttttttagat tagcgtagta tttatttgat aggttgtttg gaaaatttaa gatcggagag 2520 

gtgatttgtt gttgtttttt aaatttttta gttttaagta acgtgttttt tttttatatg 2580 

gggtggggga ttggaaatgg atgtagtgag atataaagag tgggtgtttt gttgattttt 2640 

gtattttttt ttttttgatt attttatttt ttttttttaa gttttcgatt tttagtttta 2700 

tttttttatt tttgggttcg tattaaaagt cggatcgttt tgggttgggt aggagttgaa 2760 

ttttcgggag tttgtttgtg tagatttagt gcgtacggcg aggtagtagt tcggtttcgt 2820 

attgttgata ggtgtaggta ggatagtttt tttatcgcgg ttcggggcgt tttgattggt 2880 

gcggagttac gttagtcgta ttcggagaag ggtttgggag gaggcggagg cggagagggt 2940 

tggggagggt cgcggcggag tgacgtttcg gtattaggaa gttcgttttt ggttttaaga 3000 

tgttaggtta atagggaagc gcggagtcgt agatttggtt cgtcgttcgt ttgggtgttt 3060 

ggagttgagt tgcggtaagg ttcggttttt gttcgatcgt tcgaggggtg tgcgtgtgcg 3120 

cgttgcggag ggtgcgttta gagggtcgcg tcgtggttgt agcggttgtt gtcgtcgtag 3180 

gggatttaat attatttatt tgtttttgtt atttttgata tttttttgtt agggttgtcg 3240 

cgtggggggg gggcgggtag agcgcggtcg gcgttagttt tttttattgg aggggttttt 3300 

gggggaggga gggagagaag aagggggttt ttgtttattt ttgtttcgtt ttggagtttg 3360 

gaagtttgtt ttttaaagac gttttgagtg gtgttttttt gtttatattt tatgttttcg 3420 

tttgttcgtt gatttttcgt tttcggattt tttcgtttga gtttttcgga ggagacgggg 3480 

gtagtttggt ttgagaattc ggcgggggtt gcgttttttg gtttttttcg tagcggggaa 3540 

atttcgcgtt tagagcgcga ttcggagcgg gtagcggcgg ttacgggggt tcggcggggt 3600 

agtagttaag gattagtaga gcgtcgcgtt ttttcgttta tgaattgtat gaaaggttcg 3660 

ttttatttgg agtatcgagt agcggggatt aagttgtcgg tcgttttttt atttttttgt 3720 

tattattttt agtcgttagt tatggtttcg gttttggttt tcggttagtt tcggtcgttg 3780 

gattttttta agtataggtt ggaggtgtat attattttcg atatttttag ttcggaggtc 3840 

gtaggtaagg cgtcgcgtcg ttttgtagat attttcgttt agttgttttg cgttattcgt 3900 

tttttttcgt tttaaggaag ttagtttttt cggggggagg cgtggtggga gtggtcgttc 3960 

gtttggtttt tcgtagaatt ttcgggagtc ggaattttga ttatttcgta tttttttagt 4020 

tttttttcga tcggttcggt ttttggggcg ttaagggcgc gagtaatttt gtcgtttttt 4080 

ttattcgtat tttggttttt tttttgtttt ttgggttata aaaattttag tattttgatt 4140 

cgaggatttt tagaggtcgt cgatttttgt ttttgttttt ttttcggttt ttagttttcg 4200 

aggagtttta ttcgttagga aattgtttga aattatttag aaatgttttt cgcgaagagg 4260 

tatttttttt ttttttttgg gaaagggtcg gcgaatttcg gtgtttaatc gaatttttat 4320 

attttttttt agttttttta aatcgtatgg aaatttgagt tttttgcgag ggggaggggg 4380 

gtttgtaaat tacgcgcgtg tgcgcgtttt aggagatttg gtgtgtttgc gtagaggtgt 4440 

ataaatatat ttgaaagtat aggttataaa agtgaatgtg tcgttgtagt gagataaata 4500 

tgtaaataaa acgtgcggcg ttgggggagg ggaggaaatg gggcgcggat atttatattt 4560 

gcgtttgtat attttatagg cgtagcgttt ttcgcggttc ggagtcgtcg cgcgtatttt 4620 

ttttcggcgt taggtagttt agttttttta cggtttttgt cgtcggttta gttggcgttc 4680 

gcgttgtagg tgggtatgtt gacgggaaag tgtgtgtgtt tcgtttttag agaaagataa 4740 

aagttagtag gggaagaatg aggacgtggg cgtcgaggat tcgtttaaga agaagcggta 4800 

aaggcggtag cggatttatt ttattagtta gtagttttag gagttggagg ttatttttta 4860

gaggaatcgt tattcggata tgtttatacg cgaagaaatc gttgtgtgga ttaattttac 4920

ggaagttcga gttcgggtag gagttagtac ggagtttggg agggatgggg ggaggatgtt 4980

gtggaggtat aggttaagta gattaggaga gaatgtggaa ggtagcgtcg tttgggaggg 5040 

cgtcggtggg gcgtagtttt gtaaaggtag aaggtttcgc ggcggtttgg ttgcgagatt 5100 

atagtttttt tttcgaggtc gataggattg tcgttttggt ttaggttttt agagcggtat 5160 

cggtttattg tttcgttatt tcgcgatttt acgagttggg ttgtatgggt aattttttgt 5220 

ataggatatt gtgtttttgg tttgtagttg ttagagtaga gttaataaaa tttttattag 5280 

gttaagagtc gcgaataggt tttaatttgt gagtttttaa taaggaaaat tcgttagaga 5340 

tacggaagag ttggtttttt ttgggaaatt tttgtttcgg ttttggttta gttttttttt 5400 

ttttgggttc gcgtttttta tatttttttt acggttgttt cggttattta ggtttttttt 5460

atatatttta ttttttagtt ttgtgatttt cgggagtaaa gttttaatat ataattatta 5520 

gtttttttag aaggagaaag aaaaaaagaa gaaagatttt tttgtttggt ttatttattt 5580 

ttttttagga gttgaatttt ggaaattgaa atttatattt ttttttttaa attataatta 5640 

tagttttgta aaaagggttt attttaattt tgtagtaaat ttgtatttta tggattggta 5700 

aaaatgagtt taaataaata atttaatagt aacgttttgg tttatgttgg tcggtggaag 5760 

attttaaatt tgttaggatt ttggaagtag aaaatagaat taagtaaatt aagcggtatt 5820 

tagaggtttt gttgttaaaa aaaaaaaatt aagtgttttg ggtagaaaaa ataaagtttt 5880 

cggttagagt agagtaaata aaaagaagaa aataacgata aaaagaataa agattaaaat 5940 

gtttttttaa attagaggga atgaagatat tttttgggtg gtatttgtgt aaggtatgag 6000 

gttatgttgg tggataaaag gtcgggaaga agttgaaaat ggttttagtt taattgttta 6060 






gagttagagt tgggttttgg gcggcgtggt tttgagtaag gttagttttt tattagtttt 6120 

tttgtatatt aagggaacgg gttttttacg tatttttttc gtttgagtaa agtttagatg 6180 

gtttagggta gaaatggtaa gtaattaaag atagagttta tgggtttttt gggatttttc 6240 

gaaaacgttt ttttatttcg ttcgttattt cgtagtttta ttttagtgtt ttgtagtcgc 6300 

ggcgttgggt tttttttgta gttgtttttt tttttagggc ggttgtttgt cgagttaagt 6360 

gggagtgagg cgtgtttttt atagtagtcg ggtgtaaaga ggaaggggga taaaaaggaa 6420

attaagaatg aaaggaaaaa gagaaaaagc ggattatacg gttgggttcg gcggagatgt 6480 

gtaatgtgaa atattattgg tgttagttcg gatattttag gttaggtttt tttttaatat 6540 

ataaaagtcg tcgtttgggg cgatagggag gttcgatgtg gattgggatc ggggttgcgg 6600 

ttgggttatc ggatacgggt ggaagtcggt cggtttgggt ggtcgtttgt aaagttaaac 6660 

gattcggttg ggtttggcgc gcggataggt ttgtggtggg tttagggtaa agaagaggta 6720

gagcgaaaga agggggaatt tttaaaatta tttttttcgg gttttcggag tttaatatgt 6780

taagtttttg gagttaacga gttgacgaag aggtggtttt ttgtttttta tttggttgtt 6840 

ttgttaggcg agaaagagtg ttggcggttt agtttttgtt aagggagtac gtattagggg 6900 

gtgggggacg atagtggagg ttagggaagg aagggaggaa ttgcgtggga gaaagagcga 6960 

ttttttagtg tttttttagt tttttttttt tattcgtggg tttgtggttt tggaatggaa 7020 

gtaagtttgt aaggtgtttc gggaagggtt ggaaaagttt gttgtttcgc gtttgtttta 7080 

tattaagtgt ttttggattt ggagaaacgt ttggttgagt gattaaatcg ttcgtaggtt 7140 

tttatgcgtt cggttgaggt ttgtggcgta gtttcgagtt ttagttcgta ggttagagta 7200 

gattaggttt tttgcgtttg gtggagattc gggttagtaa ttgaaagttg gttttggtat 7260 

tttggtgtgt agggcggtgt agtgaagcga ggttagggtg tgtgagtgcg ttagcgtgtg 7320 

tgtcggggga aggcgggggt tggttttcga tggaagtttt agtaatttgt attgtggtat 7380 

ttgtttgttt ttttgtttta atcgttttta ggtttggttt aagaatcgtc gggttaaatg 7440

gagaaagagg gagcgtaatt agtaggtcga gttatgtaag aatggtttcg ggtcgtagtt 7500 

taatgggttt atgtagtttt acgacgatat gtatttaggt tatttttata ataattgggt 7560 

cgttaagggt tttatattcg tttttttatt tattaagagt tttttttttt ttaattttat 7620

gaacgttaat tttttgttat tatagagtat gtttttttta tttaatttta tttcgtttat 7680 

gagtatgtcg tttagtatgg tgtttttagt agtgataggc gtttcgggtt ttagttttaa 7740 

tagtttgaat aatttgaata atttgagtag ttcgtcgttg aatttcgcgg tgtcgacgtt 7800 

tgtttgtttt tacgcgtcgt cgattttttc gtatgtttat agggatacgt gtaattcgag 7860

tttggttagt ttgagattga aagtaaagta gtattttagt ttcggttacg ttagcgtgta 7920 

gaattcggtt tttaatttga gtgtttgtta gtatgtagtg gatcggttcg tgtgagtcgt 7980

atttatagcg tcgggatttt aggattttgt cggatggggt aatttcgttt ttgaaagatt 8040 

gggaattatg ttagaaggtc gtgggtatta aagaaaggga gagaaagaga agttatatag 8100 

agaaaaggaa attattgaat taaagagaga gtttttttga ttttaaaggg atgtttttag 8160 

tgtttgatat tttttattat aagtattttt aatagttgta aggatatata tataaataaa 8220 

tgtttgattg gatatgatat tttaatatta ttataagttt gttatttttt aagtttagta 8280 

ttgttaatat ttaaatgatt gaaaggatgt atatatatcg aaatgttaaa ttaattttat 8340

aaaagtagtt gttagtaata ttataatagt gtttttaaag gttaggtttt aaaataaagt 8400

atgttatata gaagcgatta ggatttttcg tttgcgagta agggagtgta tatattaaat 8460

gttatattgt atgtttttaa tatattatta ttattataaa aaatgtgtga atattagttt 8520 

tagaatagtt tttttggtgg atgtaatgat gtttttgaaa ttgttatgta taatttattt 8580 

tgtgtataat atttcgtata atattattgt tttatttttt agtaaatatg aaataaatgt 8640

gttttatttt atgggagtaa aatatattgt atataaattg gtttggattt tttttttttt 8700

tttttgttat taatttggtt aggatatttt agttattgtt ttttaaataa attagttttt 8760

tttgtttgtt tagttaaata tataaggtag tagtttttat ttaaatttgg tagaaataaa 8820 

tgatagttat ttattagaaa ttaaaaagaa aaaaaaaggt attttcgggg gggaaaaggg 8880 

ttataaaatt taattttgtt tttttaattt ttttttggtt taaatttaga ggattttatt 8940

atggttagta aataatatga aaaagaaaaa agaagaaaga aatttagtaa gtttattagt 9000 

t 9001 



<210> 4 
<211> 9001 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 4 



agttgatgga tttgttaaat tttttttttt tttttttttt tttatattat ttgttagtta 60 
taatggaatt ttttaggttt aagttaaaga aaaattggag agataaaatt agattttgta 120 
gttttttttt

atttattttt 



agggaattgg 
gaggggagaa 
tatatttgtt 
agggtaggtt 
aaaattgata 
tatttagtat 
tgttttattt 
tataaaatta 
atgttaaatt 
atttgtttgt 
attgaggata 
tttatatagt 
tagtttttta 
tgtggtttat 
ttgtatgttg 
gtttgagtta 
aggtgttggt 
gttgagattg 
tatggatgag 
tatagagttg 
ggtttagttg 
gaattgtggt 
ttatttggtt 
gatgttatag 
atatatgtta 
gatattaggg 
ttgttttggt 
agatttgtgg 
gtaaaataaa 
tttttatttt 
attgtttttt 
ttttttggta 
aaataattaa 
ggtatattaa 
ttgttttttt 
tgtttggttt 
gttgtaattt 
tgtattggag 
tatatttttg 
tttttttttt 
tatttggttt 
tgtggttata 
tgaaggattt 
ttatttaaat 
gagagttaat 
ttggatagtt 
ttttatgttt 
tattttggtt 

gggggtttta 

ggatgttgtt 
ttttttattg 
ttgttaattt 
atgattataa 
aagataaata 
ttagtagttg 
tgggagagat 
agggaaaaag 
gtttttggtg 
tttggtaggg 
tgtaggggat 
ggtgttgttt 



tttttgggaa 
attaaattta 
tttgtttagg 
agaatttaga 
ttatatttgt 
gtatatagta 
tttatatatt 
atatattttt 
taaagtttaa 
atttgatatt 
taaaaaataa 
gtatgtgttt 
tttttttgaa 
tttttttttt 
agggtggagt 
atgggttggt 
gtgtagttga 
tatgtgtttt 
attgtggaat 
gagtttggga 
atagagttgg 
aagaagggga 
ttgtaggaat 
ttgaagttat 
tgatgatttt 
tgtagattat 
gtgtatttat 
ttagttttta 
ttgtgagttg 
atggtttgat 
tgtggggtag 
aaagttatag 
tttttatgta 
tgtgtttttt 
ataaagagta 
attttgggaa 
tttattttaa 
tgtaggtggt 
tgattttaat 
agaggtttgg 
ttgggtttag 
attttttttt 
gataagtagt 
aagtattagg 
tagaaaattt 
tttgtttagg 
gaaagattga 
aaattaaaat 
tgtataaatg 
tttatttttt 
ttttttttat 
tgatttgttt 
attagtataa 
ataaagtata 
tttagaagag 
aattaagtag 
tgtattaaaa 
ttgaatggtt 
ttgggttagg 
ggtttttttt 
attttattag 
tgtttatgta 

tgggagtttg 



tgtttttttt 
aataaggatt 
aagtagtgat 
ttaatttgta 
tgaaaagtaa 
gttttagaaa 
ttttataata 
ttgtttgtaa 
tttttaaaaa 
ttgatatata 
taagtttata 
ttgtaattgt 
attaaaggag 
tttttttttt 
tgttttattt 
ttattgtata 
agttggagtg 
tataaatata 
ttagtgatgg 
tgtttgttat 
gtggggaaaa 
agtttttggt 
agtttgggta 
ttttgtatag 
tgaattaaat 
taaaattttt 
atattttggt 
gttattggtt 
ggatttggag 
tatttagtta 
taggtttttt 
atttatggat 
attttttttt 
taatagggat 
aaagattatt 
tttggaaagg 
gtttattata 
tatttaggtt 
ttatattggg 
tttgagatat 
ttgtgtaatt 
ttttttgtat 
tgttttggaa 
gtggagttgt 
atagattttg 
tgagaagagt 
ttttgtttaa 
tatttttaat 
ttatttagag 
ttattgttgt 
ttagagtatt 
gattttgttt 
attaggatgt 
gatttgttat 

ggggtgtgag 

aaaagttttt 
ttttgttttt 
gaaataattg 
gttgggataa 
gttaaaggtt 
ttttgttttg 
gtttagtttg 
agttagggtg 



ttttttttta 
gttgttttgt 
tgagatgttt 
tgtagtatat 
aataataata 
tattattgta 
ataataatat 
gtgaaaaatt 
tattgttgtg 
tatatttttt 
gtaatgttaa 
tagaaatatt 
tttttttttt 
ttagtgttta 
ggtaaggttt 
ttggtaagta 
ttgttttgtt 
tggaggagtt 
gttatttagg 
tgttgagggt 
tatgttttgt 
ggatagggag 
tatgttgttg 
tttggtttgt 
ttgggggtgg 
attggaggtt 
tttgttttat 
tgggttttta 
ttatgttata 
ggtgtttttt 
taattttttt 
gaggagaagg 
tttttttttg 
taggttgtta 
tttttgttag 
gtagttttgg 
ggtttgtttg 
ggttggtttt 
tttttttgtt 
ttgagttgat 
tgtttttttt 
ttgattgtta 
ggagaggtag 
ggaatagtgg 
tttttaatta 
atgtgagagg 
aattatgttg 
ttttttttgg 
agtgttttta 
tttttttttt 
taattttttt 
tttgttttta 
tgttattggg 
aaagttaagg 
ttttaatttt 
tttttttttt 
ggagattata 
taaagaaggt 
aggtttttta 
tataggttgg 
gtaattgtaa 
tgagattgtg 
gtagttttgt 



gtttttgatg 
atgtttaatt 
tggttaagtt 
tttattttta 



ttgtatgaaa 
tttattagag 
gttagaaata 
ttaattgttt 
atattattaa 
tagttattta 
aatgttatat 
tgtagtgaaa 
gatttagtgg 
tgatttttta 
taggattttg 
tttaggttgg 
tttagtttta 
ggtggtgtgt 
ttgtttaagt 
attatgttgg 
gatgataggg 
gtggatgtaa 
tagggttgta 
tggttgtgtt 
ttggggtaag 
aatttttgtt 
tgtattgttt 
ttaagtgtag 
aattttagtt 
taggtttaaa 
tggggtattt 
ggttggaagg 
atttttattg 
atattttttt 
tttgttaatt 
agattttttt 
tgtgttaggt 
tatttgtgtt 
gttttagatg 
attagtgatg 
tttttttttt 
taaaaagtat 
ttgtaaggag 
gtggggtggg 
tttgttattt 
tttgtttttt 
tttaggattt 
ttttttattt 
ttttttttga 
ttgtttgttt 
tttttaatag 
gaattttaat 
ttatttattt 
taagtttttt 
tagagtttaa 
tttttttttt 



aaattaggaa 
gtaagaagtg 
gggagggtta 
agtttgtttg 
gttaggaata 
ggatggtggg 
tggttttgga 



aatggttatt 
aggtaggtag 
agtgatagag 
tgaaataaaa 
tgttatatat 
aaattatttt 
tatagtgtgg 
ttgtataata 
taattgtttt 
aatgttaata 
ttagttaaat 
gatgttagat 
tttttttttt 
gtataatttt 
gtgttgtggg 
aggttgggtt 
ggttggttag 
aaggataggt 
tatttaggtt 
atgatatgtt 
ggttgatgtt 
ggtttttggt 
tgagtttatt 
tttttttttt 
ggagtaaata 
tttttttgat 
tgtatattaa 
gagatttggt 
gaatgtatgg 
aatatttaat 
tgtaaatttg 
gtattagagg 
ttgtttttta 
ttgtttagta 
ttaggagttt 
tttttttgtt 
ttagttgggt 
tggtggttta 
gtggtttttg 
ttttatatta 
ttatttttga 
gttttatttt 
agtttagtgt 

^gggtgtttt 

ttattttagg 
tgatgtgtaa 
agttttggtt 
attagtatag 
tttgggagag 
tgttttaatt 
taaagttttt 
aaatttggaa 
gagtttattt 
ttataaaatt 
tttttgagag 
tttaagagga 
atagggtgtg 
tgagtttagg 
atttttttgt 
tggtttttgg 
taatgttttg 
gtagtgagtt 

g^gggaattg 



180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
taattttgta
gtttttttag 
taatattttt 
tgtaaggttg 
ttggaaagtg 
ttgttgtttt 
tttatttttt 
tggatgttaa 
gagggtgtgt 
taagtgtggg 
atgtttattt 
tatatttttg 
tttttttttt 
tgtggggatt 
gtttttttgt 
ttgggagttg 
gaattaagat 
gagagggtgg 
aattaaaggg 
tgaatgatta 
agtgggtggt 
tggtttttgg 
ttagtgattg 
gataggaaga 
gtgggttttt 
tgttttgttg 
tttttttgtt 
tttttgtttt 
atgaagatat 
ttaagtttta 
taagaatttt 
gtggtagttt 
tttgtggtgg 
gtgtatatgt 
taggtattta 
attttaaaat 
agtttttttt 
tattaattag 
tgtggggttg 
atttagtttt 
atgaggttag 
taaggattaa 
ttatataaaa 
tttttttgat 
agttttaatt 
aagtaaatgg 
tttattttgg 
attagttttt 
tgttgaaagt 
attttttgta 
ttttttgttt 
tttgagttaa 
taaataatta 
gttgttttga 
aagtttagaa 
ttatttttgg 
aaattaaaga 
agtataggtt 
tttattatgg 
gatttgtaat 
— attrgggaa-at 
atttatgtgt 
tttaaatttg 



attaggttgt 
gtggtgttgt 
tttttatttt 
gtttatatag 
gtttttagtt 
tttttagatg 
tttgaaaatg 
ttggattggt 
gtggtggttt 
tgtttgtgtt 
tattgtagtg 
tgtagatata 
ttttgtagaa 
tggttgggta 
gaagggtatt 
aaagttgaga 
gttgggattt 
tagaattgtt 
atgtggggta 
tttttattat 



gtagagtagt 
gttggaggtg 
gggttgattg 
tgaggagatg 
tatgtagttt 
agtttttgta 
gtggggagag 
ttttggaagg 
gggatgtggg 
aagtgaaata 
tttaataagg 
tgatagagaa 
tagtagttgt 
atattttttg 
ggtgagtgat 
tagaggtggg 
gtttttgttt 
gatgttttga 
ggttattgtt 
tgtttagttt 
gagttgaagg 
taagatattt 
aggagatatg 
tttaaatttt 
gtaaagaaga 
atttgttaaa 
atttatataa 
agatagaaaa 
ttatttattt 
tattgtatta 
attttattat 
ttatatatga 
attttaagga 
gttattaaag 
ttgtgtttaa 
agatattaat 
aagaaattta 
agttggagag 
agttagtgtt 
gttagtatga 
-t-tat-fefegatg^ 
tttttttttg 
tagtttaaag 



tgtgaggttt 
tttttatatt 
ttttagattt 
tgattttttt 
tttggagttg 
ggtttttggt 
aaatatatat 
ggtagaagtt 
tgggttgtga 
ttattttttt 
gtatatttat 
ttaaattttt 
agtttagatt 
ttgaagtttg 
tttgagtggt 
ggaaaatagg 
ttgtgattta 
tgtgttttta 
gttaaaattt 
gttttttttt 
tgagtgggaa 
ttggagatgg 
ggagttagaa 
gttgatagtt 
atggatgagg 
gttgttgttg 
ttaggggatg 
tttaagtgaa 
tagaagggta 
agagtgggta 
aaagttaatg 
gtgttaagag 
tgtagttatg 
ggtggttgaa 
ggattagatt 
ttttttggtg 
ttttttagat 
gttgtggtgg 
ttgttgtgtg 
aaggtgattt 
tttgggagaa 
attttttgtg 
ttatttaaaa 



tttgtttttg 
tttttttggt 
tgtgttggtt 
gtgtgtggat 
ttggttggta 
gtttatgttt 
atttttttgt 
gtggaagagt 
ggagtgttgt 
ttttttttta 



ttaaatagt-t 
tagagttttg 
agaaaattat 
gaataaaaag 
tatataatag 
tatttaatgt 
agtttgtttt 
taattagatt 
aatatgtttt 
taatttttaa 
tttaagtagg 
ttagtaaaag 
tttttatagt 
aatatattta 
gataaattaa 
attatttttg 
agtattttta 
-fe gg a a t aa a g 
ttttgattat 
atgtatatga 



ttttatagtt 
tgggatgtgt 
tttatgtggt 
ttggtttttt 
tttaggtaat 
gatagaggtt 
ggaaatagaa 
gtgttttagg 
tggtttttgg 
ggaggggttg 
tgtttgtagg 
tgtgtatttt 
ttgaagttat 
tggtttttgt 
gagtgtgatg 
tttgttttgg 
taatttttgt 
aaagtttgga 
ttatttagag 
aagatttttt 
ttgattgtgt 
tgatagggat 
atgtggtttt 
taggagttgg 
tgtggttttg 
ttgagatgtt 
ttttttttgg 
agggattgtt 
tattgggttt 
ggtttttagt 
gagagtggaa 
ttttattata 
ttagaaaatt 
tgttaagtga 
aaaaggtagg 
tttgatttta 
ttgttttaga 
aagagaaatt 
tgattaagat 
aatttgagag 
taaatttata 



taaagttgtg 

ttatttggtt 

tttatttgga 

atgtttgggt 

aagtgagttt 

ttattttttt 

tagtatgttt 

taaattattt 
^ ^ ^ 

gtttgtgggg 
gtgttgtatg 
tgtgttttta 

atatgtgtgt 
ttgggaaggt 
tttaaaaaaa 



aatgaatttt 
tagttatttt 
tagaaggggt 
taaaatttta 
attgttttta 
aaataatttt 
ttttttttgg 
aatgtgtatt 
aaatattttt 
-tg ga.tgaa.g-t_ 
ttttaaattt 
gaattgtttt 



tttttaatga 
ggtggttttt 
gggaggttag 
agttgggttg 
aagttttgtg 
atttttttgg 
gtggtgtggt 
tagtttgtgt 
ggttaatggt 
tgtttggtgt 
ttttattagt 
gttgtgtttt 
tgagttttta 
gatggaaagt 
tgtttttagg 
tttttttttt 
tttgtttgtt 
aggtaggtga 
ttgagtgtat 
gttttgttgt 
tgtttttttg 
attttgttgt 
gtgtgattga 
ttgtttgtat 
atataggtaa 
atgaatttaa 
tggttaagaa 
tttattttta 
tgaaaaatag 
atgttgtgtt 
ttaataaatt 
aatgaataat 
ttatgttttt 
ttaatttagt 
atatatttta 
agttatgttt 
aagtttgtag 
tatataatta 
ttttttttag 
gtgtgtgagt 
tttatataaa 



agtaaattta 
tgaaagtttt 
gtttttgtat 
tgtttgatat 

tttttgtttt 
_agat.tiata.a.a 
tatttgtaat 
ttagtttttt 



ttttattggt 
tgtattttta 
tttgggtttt 
agtggttttt 
gttgttgttt 
tttgttggtt 
atttgtaatg 
ggtgttggag 
tgtgtaggtg 
ttttatttat 
agtatattta 
ggtttataga 
taggaaaaga 
aaaaaaaaat 
gtggagtttt 
gaaggttttt 
ggtatgaata 
gttgagggag 
gggagttagg 
ggtgagaggg 
gttttatttg 
ttggaggagt 
tggggatggt 
tttaagtgaa 
ttttggttat 
aggtgtggag 
agttaagttg 
tagtgggtaa 
gagtaggttt 
tttttttttt 
ttttttttat 
tattagattt 
tttttgtaat 
agtttagttt 
ttggtttaat 
ggttttttt-t 
tgtggttttg 
ttattagtag 
gtttttggga 
aggtgaagag 
gagaaaggta 
attttttatt 
taataaatta 
aatttgaaga 
agaaattgag 
tgtttggtgg 
tgtgatgttt 
gtttttaaaa 
gattttttaa 
taaatttgat 
aattaataat 
agaatgttgt 
tgagtttaag 
taagggtgaa 
ataaaaaaaa 
atttttaaag 
tttgtttttt 
gggtgattgt 
tatagttaat 
tgtttataag 
._t a t a 1 1 t,g.t a_ 
ttttttttat 
tttattagta 



3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6960 

7020 

7080 

7140 

7200 

7260 

7320 

7380 

7440 

7500 

7560

7620 

7680 






ttattttatt ttaagaataa tttagttgta agggaggaat ttttttatag taagttttaa 7740 

attagtattt ttgtttttaa ttttttattt tattttattt tattttatat atatagatat 7800 

ttgtttagag taaaatatat ttttatgtga taggtttgta ttagttgagg tttatatatt 7860

tagttatatt aggttttgta attttattat taaattatat atattatatt agtagtttgt 7920 

tggtaaagaa ggttaaatta atttatattt tgtttattat ttggtgttta aatgatgtat 7980

tttattttgg agatttggtg gagaattttt tttttagatt ttatagtgtt ttattgaaga 8040

taatgttttt atatttgtag tggtttttaa tttgataaga ttttaatttg tttaagtttt 8100 

ttaaataagg gttttaaatg tttttagttg tttttttatt gaattttttt taattttttt 8160 

aagattataa agtatatgtg taaagtaaat attttttttt attgtattgt tagttgatga 8220 

tttataatta agttaataag aatttagttt ttttttgttg aatgtgttta ttaattatat 8280 

tttagttttt ttttttaaat tttagaatag ttgtggtttt tataatatta tgttttttaa 8340 

agttttattt tatgaaggga ttttattata ttaaagaatg aaaaaaattt ttattgtagt 8400 

tagtatatat agttttttat tttttgtttt ttaagattta aattttagag ttgtaaatat 8460

ttttggaagt ttgggtgtta atgttttatt ttagaaagtt gagaagtttt atagagttat 8520 

atagattttt aaatttattt tttataaatt tatagaattt tgataaaagt tttggtggtt 8580 

ttattttatt gatggaattt ttattatgat aaatatatat gtatgaagga ttttaattag 8640

tttttaaagt ggttgaaaaa tttaagggta tgtgattgtt ttttatagtg ttaatgtgtg 8700 

tgagatgttg gaagtattgg ggattagtag tagtttagat gtttaaaaag ataaggtgtt 8760 

ttaatttgtg tggatttatt gaagttaagt ggtgaataaa gataattatt tagataattt 8820 

agattaaagt aaaagtaaaa ttatatttat ttgtatatat atatttatat ttattttata 8880 

ttatagatat atatatgtat atatatattg gttttgtaaa taattgattt aaagtgagga 8940

ttttttttgt atttttttag taggagtttt aatatttttt taatttttta attattttat 9000 

^ 9001 

<210> 5 
<211> 9001 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 

<400> 5 

tgtaaagtga ttaggagatt aggagaatgt tgaaattttt gttggaaaaa tgtaaagaaa 60 

atttttattt tgagttagtt gtttatagag ttagtgtgtg tgtgtgtgtg tgtgtttgta 120 

atataaaatg gatgtgaata tatatatata aatagatatg gttttgtttt tattttaatt 180 

tgaattattt agataattgt ttttatttat tatttgattt taatgggttt atataaatta 240 

ggatatttta tttttttagg tatttaggtt gttgttgatt tttagtgttt ttaatatttt 300 

gtatatgttg gtattatgag gagtagttat gtgtttttgg gttttttaat tattttggag 360 

gttgattgag gttttttata tatgtatatt tgttgtgatg aaagttttat tggtagagtg 420 

gagttattag agtttttatt aaaattttgt gggtttatga gagatgggtt tagaaattta 480

tatggttttg tggggttttt tggtttttta aaataaggta ttaatattta agtttttaaa 540

aatatttgta gttttggggt ttgaattttg aaaaataagg agtgaggggt tgtgtatatt 600 

aattatagtg gagatttttt ttatttttta atgtgatgga gtttttttat gaaatgaagt 660 

tttaaggggt atggtattgt ggggattata gttattttga ggtttaaaag aagaaattgg 720 

aatatgatta gtaaatatat ttagtagaaa agagttggat ttttattgat ttagttatag 780

gttattggtt ggtagtgtaa tgggaggaaa tatttatttt atatatatat tttatgattt 840

tgggggaatt agaggaaatt taataagaaa atggttagaa atatttaaaa tttttattta 900 

aaagatttaa gtaaattaga gttttattag attaaaaatt attataaatg taagagtatt 960 

gtttttagtg aaatgttgtg gggtttgaga aggagatttt ttgttaaatt tttgggataa 1020 

aatgtgttat ttaagtatta gataatgagt agaatgtaaa ttaatttaat tttttttatt 1080 

aataggttgt tagtgtaatg tgtataattt agtgataaga ttgtaggatt taatatagtt 1140

ggatgtatga gttttagtta atgtagattt gttatatgag gatgtgtttt attttgagta 1200 

ggtgtttgta tgtgtggaat ggggtaaagt ggaataaaag gttaaaagta gaaatgttga 1260

tttaaagttt attatgaaga aatttttttt ttgtagttaa attattttta aagtgggatg 1320 

atattggtga agaaagattg aaaaataatt tttatgtgtg tttttggatt gtaagtttaa 1380 

aatggggagg agttgtagat agggtttggg ggtggttagg gtaaaggaga gatatataag 1440

ttgtaaatat atttgtagtt tgttttattt attttgtttt atattgaata agttttttaa 1500 

ttttgtgaat aaggataagg agggagtgtt ttaaagatat tttatgttgg tattgtaaat 1560

tattgattgt aatgttaaat aaatatatat ttagagatga taatattaat tttatagtaa 1620 

aataattgtt tatgtagaaa tttagaggag attagtttgt tttttttagt tgatttatgt 1680 

tgggggataa aaggattttt aaaaattatt ttgaatatgt ttggattttt ttttttaatt 1740 
tttttggaaa ttaaatttgt ttggaaatag tgttataaag agttgatgtt tttaaaggtg 1800

attttttttg ttttatataa ataaggtttt gtttttgtta gttgagtgta gttttaggtt 1860 

ttttgttttt agtttatata tatttttttt gtttgtttgg attttaatgg tttaagatag 1920 

ttttgagttt attgggaaaa gaaaatgatt gttaaaaatt atttttgaaa ttggttattt 1980 

ggtaatattt ttaattgtat ggaaatttat taaggtatat tttatatata attagtttaa 2040 

ggttgttgat tttataggtt ttatggattt aaatttgatt gataataaag taaataagag 2100 

agttgaattt aaagtgtggt ttttttgggt taggatgagt ttaatatagt gtataaggaa 2160 

tttgaaagat ttaggatatg tgttttaatt aatgttaagt agaatggata agtttttagt 2220 

attttgaaaa tgttgggtta gggttttttt tttattgtgt gttttttgtt tggggattaa 2280 

taagtattat agagaatgtg atttgaggtg attttttatt tttgtataaa tttagagtga 2340 

attattaaat agttgtttgt ttaaagttaa ggtaattttt ttttgatggg tttatttgtt 2400

ttttgatttt taatttatta gtttgttttt ttagggtttt gttttttttg taattaaagt 2460

ttttttagat tagtgtagta tttatttgat aggttgtttg gaaaatttaa gattggagag 2520 

gtgatttgtt gttgtttttt aaatttttta gttttaagta atgtgttttt tttttatatg 2580 

gggtggggga ttggaaatgg atgtagtgag atataaagag tgggtgtttt gttgattttt 2640 

gtattttttt ttttttgatt attttatttt ttttttttaa gtttttgatt tttagtttta 2700 

tttttttatt tttgggtttg tattaaaagt tggattgttt tgggttgggt aggagttgaa 2760

tttttgggag tttgtttgtg tagatttagt gtgtatggtg aggtagtagt ttggttttgt 2820 

attgttgata ggtgtaggta ggatagtttt tttattgtgg tttggggtgt tttgattggt 2880 

gtggagttat gttagttgta tttggagaag ggtttgggag gaggtggagg tggagagggt 2940 

tggggagggt tgtggtggag tgatgttttg gtattaggaa gtttgttttt ggttttaaga 3000 

tgttaggtta atagggaagt gtggagttgt agatttggtt tgttgtttgt ttgggtgttt 3060 

ggagttgagt tgtggtaagg tttggttttt gtttgattgt ttgaggggtg tgtgtgtgtg 3120 

tgttgtggag ggtgtgttta gagggttgtg ttgtggttgt agtggttgtt gttgttgtag 3180 

gggatttaat attatttatt tgtttttgtt atttttgata tttttttgtt agggttgttg 3240

tgtggggggg gggtgggtag agtgtggttg gtgttagttt tttttattgg aggggttttt 3300 

gggggaggga gggagagaag aagggggttt ttgtttattt ttgttttgtt ttggagtttg 3360 

gaagtttgtt ttttaaagat gttttgagtg gtgttttttt gtttatattt tatgtttttg 3420 

tttgtttgtt gattttttgt ttttggattt ttttgtttga gttttttgga ggagatgggg 3480

gtagtttggt ttgagaattt ggtgggggtt gtgttttttg gttttttttg tagtggggaa 3540

attttgtgtt tagagtgtga tttggagtgg gtagtggtgg ttatgggggt ttggtggggt 3600 

agtagttaag gattagtaga gtgttgtgtt tttttgttta tgaattgtat gaaaggtttg 3660 

ttttatttgg agtattgagt agtggggatt aagttgttgg ttgttttttt atttttttgt 3720 

tattattttt agttgttagt tatggttttg gttttggttt ttggttagtt ttggttgttg 3780

gattttttta agtataggtt ggaggtgtat attatttttg atatttttag tttggaggtt 3840 

gtaggtaagg tgttgtgttg ttttgtagat atttttgttt agttgttttg tgttatttgt 3900 

ttttttttgt tttaaggaag ttagtttttt tggggggagg tgtggtggga gtggttgttt 3960 

gtttggtttt ttgtagaatt tttgggagtt ggaattttga ttattttgta tttttttagt 4020 

ttttttttga ttggtttggt ttttggggtg ttaagggtgt gagtaatttt gttgtttttt 4080 

ttatttgtat tttggttttt tttttgtttt ttgggttata aaaattttag tattttgatt 4140

tgaggatttt tagaggttgt tgatttttgt ttttgttttt tttttggttt ttagtttttg 4200 

aggagtttta tttgttagga aattgtttga aattatttag aaatgttttt tgtgaagagg 4260 

tatttttttt ttttttttgg gaaagggttg gtgaattttg gtgtttaatt gaatttttat 4320 

attttttttt agttttttta aattgtatgg aaatttgagt tttttgtgag ggggaggggg 4380 

gtttgtaaat tatgtgtgtg tgtgtgtttt aggagatttg gtgtgtttgt gtagaggtgt 4440 

ataaatatat ttgaaagtat aggttataaa agtgaatgtg ttgttgtagt gagataaata 4500

tgtaaataaa atgtgtggtg ttgggggagg ggaggaaatg gggtgtggat atttatattt 4560

gtgtttgtat attttatagg tgtagtgttt tttgtggttt ggagttgttg tgtgtatttt 4620

tttttggtgt taggtagttt agttttttta tggtttttgt tgttggttta gttggtgttt 4680

gtgttgtagg tgggtatgtt gatgggaaag tgtgtgtgtt ttgtttttag agaaagataa 4740 

aagttagtag gggaagaatg aggatgtggg tgttgaggat ttgtttaaga agaagtggta 4800

aaggtggtag tggatttatt ttattagtta gtagttttag gagttggagg ttatttttta 4860

gaggaattgt tatttggata tgtttatatg tgaagaaatt gttgtgtgga ttaattttat 4920

ggaagtttga gtttgggtag gagttagtat ggagtttggg agggatgggg ggaggatgtt 4980

gtggaggtat aggttaagta gattaggaga gaatgtggaa ggtagtgttg tttgggaggg 5040 

tgttggtggg gtgtagtttt gtaaaggtag aaggttttgt ggtggtttgg ttgtgagatt 5100 

atagtttttt ttttgaggtt gataggattg ttgttttggt ttaggttttt agagtggtat 5160 

tggtttattg ttttgttatt ttgtgatttt atgagttggg ttgtatgggt aattttttgt 5220 

ataggatatt gtgtttttgg tttgtagttg ttagagtaga gttaataaaa tttttattag 5280 

gttaagagtt gtgaataggt tttaatttgt gagtttttaa taaggaaaat ttgttagaga 5340

tatggaagag ttggtttttt ttgggaaatt tttgttttgg ttttggttta gttttttttt 5400

ttttgggttt gtgtttttta tatttttttt atggttgttt tggttattta ggtttttttt 5460

atatatttta ttttttagtt ttgtgatttt tgggagtaaa gttttaatat ataattatta 5520 
gtttttttag
ttttttagga 
tagttttgta 
aaaatgagtt 
attttaaatt 
tagaggtttt 
tggttagagt 
gtttttttaa 
gttatgttgg 
gagttagagt 
tttgtatatt 
gtttagggta 
gaaaatgttt 
ggtgttgggt 
gggagtgagg 
attaagaatg 
gtaatgtgaa 
ataaaagttg 
ttgggttatt 
gatttggttg 
gagtgaaaga 
taagtttttg 
ttgttaggtg 
gtgggggatg 
ttttttagtg 
gtaagtttgt 
tattaagtgt 
tttatgtgtt 
gattaggttt 
tttggtgtgt 
tgttggggga 
ttgtttgttt 



aaggagaaag 
gttgaatttt 
aaaagggttt 
taaataaata 



gagaaagagg 

taatgggttt 

tgttaagggt 

gaatgttaat 

gagtatgttg 

tagtttgaat 

tgtttgtttt 

tttggttagt 

gaatttggtt 

atttatagtg 

gggaattatg 

agaaaaggaa 

tgtttgatat 

tgtttgattg 

ttgttaatat 

aaaagtagtt 

atgttatata 

gttatattgt 

tagaatagtt 

tgtgtataat 

gttttatttt 

tttttgttat 

tttgtttgtt 

tgatagttat 

ttataaaatt 

atggttagta 

t 

<210> 6 
<211> 22 
<212> DNA 



tgttaggatt 
gttgttaaaa 
agagtaaata 
attagaggga 
tggataaaag 
tgggttttgg 

aagggaatgg

gaaatggtaa 
ttttattttg 
tttttttgta 
tgtgtttttt 
aaaggaaaaa 
atattattgg 
ttgtttgggg 
ggatatgggt 
ggtttggtgt 

agggggaatt 

gagttaatga 
agaaagagtg 
atagtggagg 
tttttttagt 
aaggtgtttt 
ttttggattt 
tggttgaggt 
tttgtgtttg 
agggtggtgt 
aggtgggggt 
ttttgtttta 
gagtgtaatt 
atgtagtttt 
tttatatttg 
tttttgttat 
tttagtatgg 
aatttgaata 
tatgtgttgt 
ttgagattga 
tttaatttga 
ttgggatttt 
ttagaaggtt 
attattgaat 
tttttattat 
gatatgatat 
ttaaatgatt 
gttagtaata 
gaagtgatta 
atgtttttaa 
tttttggtgg 
attttgtata 
atgggagtaa 
taatttggtt 
tagttaaata 
ttattagaaa 
taattttgtt 
aataatatga 



aaaaaaagaa gaaagatttt tttgtttggt ttatttattt 
ggaaattgaa atttatattt ttttttttaa attataatta 
attttaattt tgtagtaaat ttgtatttta tggattggta 
atttaatagt aatgttttgg tttatgttgg ttggtggaag 
ttggaagtag aaaatagaat taagtaaatt aagtggtatt 
aaaaaaaatt aagtgttttg ggtagaaaaa ataaagtttt 
aaaagaagaa aataatgata aaaagaataa agattaaaat 
atgaagatat tttttgggtg gtatttgtgt aaggtatgag 
gttgggaaga agttgaaaat ggttttagtt taattgttta 
gtggtgtggt tttgagtaag gttagttttt tattagtttt 
gttttttatg tatttttttt gtttgagtaa agtttagatg 
gtaattaaag atagagttta tgggtttttt gggatttttt 
tttgttattt tgtagtttta ttttagtgtt ttgtagttgt 
gttgtttttt tttttagggt ggttgtttgt tgagttaagt 
atagtagttg ggtgtaaaga ggaaggggga taaaaaggaa 
gagaaaaagt ggattatatg gttgggtttg gtggagatgt 
tgttagtttg gatattttag gttaggtttt tttttaatat 
tgatagggag gtttgatgtg gattgggatt ggggttgtgg 
ggaagttggt tggtttgggt ggttgtttgt aaagttaaat 
gtggataggt ttgtggtggg tttagggtaa agaagaggta 
tttaaaatta ttttttttgg gtttttggag tttaatatgt 
gttgatgaag aggtggtttt ttgtttttta tttggttgtt 
ttggtggttt agtttttgtt aagggagtat gtattagggg 
ttagggaagg aagggaggaa ttgtgtggga gaaagagtga 
tttttttttt tatttgtggg tttgtggttt tggaatggaa 
gggaagggtt ggaaaagttt gttgttttgt gtttgtttta 
ggagaaatgt ttggttgagt gattaaattg tttgtaggtt 
ttgtggtgta gttttgagtt ttagtttgta ggttagagta 
gtggagattt gggttagtaa ttgaaagttg gttttggtat 
agtgaagtga ggttagggtg tgtgagtgtg ttagtgtgtg 
tggtttttga tggaagtttt agtaatttgt attgtggtat 
attgttttta ggtttggttt aagaattgtt gggttaaatg 
agtaggttga gttatgtaag aatggttttg ggttgtagtt 
atgatgatat gtatttaggt tatttttata ataattgggt 
tttttttatt tattaagagt tttttttttt ttaattttat 
tatagagtat gtttttttta tttaatttta ttttgtttat 
tgtttttagt agtgataggt gttttgggtt ttagttttaa 
atttgagtag tttgttgttg aattttgtgg tgttgatgtt 
tgattttttt gtatgtttat agggatatgt gtaatttgag 
aagtaaagta gtattttagt tttggttatg ttagtgtgta 
gtgtttgtta gtatgtagtg gattggtttg tgtgagttgt 
aggattttgt tggatggggt aattttgttt ttgaaagatt 
gtgggtatta aagaaaggga gagaaagaga agttatatag 
taaagagaga gtttttttga ttttaaaggg atgtttttag 
aagtattttt aatagttgta aggatatata tataaataaa 
tttaatatta ttataagttt gttatttttt aagtttagta 
gaaaggatgt atatatattg aaatgttaaa ttaattttat 
ttataatagt gtttttaaag gttaggtttt aaaataaagt 
ggattttttg tttgtgagta agggagtgta tatattaaat 
tatattatta ttattataaa aaatgtgtga atattagttt 
atgtaatgat gtttttgaaa ttgttatgta taatttattt 
atattattgt tttatttttt agtaaatatg aaataaatgt 
aatatattgt atataaattg gtttggattt tttttttttt 
aggatatttt agttattgtt ttttaaataa attagttttt 
tataaggtag tagtttttat ttaaatttgg tagaaataaa 
ttaaaaagaa aaaaaaaggt atttttgggg gggaaaaggg 
tttttaattt ttttttggtt taaatttaga ggattttatt 
aaaagaaaaa agaagaaaga aatttagtaa gtttattagt 



5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6960 

7020 

7080 

7140 

7200 

7260 

7320 

7380 

7440 

7500 

7560 

7620 

7680 

7740 

7800 

7860 

7920 

7980 

8040 

8100 

8160 

8220

8280 

8340 

8400 

8460 

8520 

8580 

8640 

8700 

8760 

8820 

8880 

8940 

9000 

9001 



<213> Artificial Sequence 
<220> 

<223> PRIMER 
<400> 6 

gtaggggagg gaagtagatg tt 

<210> 7 
<211> 24 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> PRIMER 
<400> 7 

ttctaatcct cctttccaca ataa 

<210> 8 
<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PROBE 
<400> 8 

agtcggagtc gggagagcga 

<210> 9 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PROBE 
<400> 9 

agttggagtt gggagagtga aaggaga 

<210> 10 
<211> 25 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> PRIMER CONTROL 
<400> 10 

tggtgatgga ggaggtttag taagt 

<210> 11 
<211> 27 
<212> DNA 

<213> Artificial Sequence



<220> 
<223> PRIMER CONTROL



<400> 11 



aaccaataaa acctactcct cccttaa 



27 



<210> 12 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PROBE CONTROL 
<400> 12 

accaccaccc aacacacaat aacaaacaca 30 

<210> 13 
<211> 408 

<212> DNA 

<213> Homo sapiens



<210> 14 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 14 

aacatctact tccctcccct ac 22 

<210> 15 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 15

gttagtagag attttattaa attttattgt at 32 

<210> 16 
<211> 14 
<212> DNA 

<213> Artificial Sequence 



<400> 13 



tcctcaactc 
cgcctcgtcc 
cctgtcgctt 
gctgccctgg 
tcttctcttc 
ctgctcctcg 
tccggctccc 



tgcaggcctg 
caaacaaccc 
cctcccagcc 
cgctccccct 
catcccatcc 
gttggctcct 
gactcttcgg 



aaagaaggtc acacacgcac gctcacaccc acactccaca 
catgaacatt gtcctttgtt ccgtctcttg ggccactttc 

cgtcctgatt tgctccccaa aagtacgttt ctgtctcccc 

ttgatttatt agggctgccg ggttggcgca gattgctttt 

tcccttctgg tcctcctttc cacagtggga gtccgtgctc 

aagtgccccg ccaggtcccc tctcctttcg ctctcccggc 
cccgctggca tctgcttccc tcccctgc 



120 
180 
240 
300 
360 
408 



60 



<220> 
<223> chemically treated genomic DNA (Homo sapiens)

<400> 16 

ttcggttgcg cggt 

<210> 17 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 17 

tttggttgtg tggttg 

<210> 18 

<211> 144 

<212> DNA 

<213> Homo sapiens

<400> 18 

ttctggtcct cctttccaca gtgggagtcc gtgctcctgc tcctcggttg gctcctaagt 
gccccgccag gtcccctctc ctttcgctct cccggctccg gctcccgact cttcggcccg 
ctggcatctg cttccctccc ctgc 

<210> 19 
<211> 162 

<212> DNA 

<213> Homo sapiens

<400> 19 

tggcatctgc ttccctcccc tgcctcgttt ctcgtcgccc ctgctcgctc cccccggcgc 

tcgcccgggc gctgtgctcg ctcctggatc gccagccgcg cagcgggctc gccggcgccc 

gcgcgccact gtgcagtgga gtttggtgga atctctgctg ac 

<210> 20 
<211> 2235 
<212> DNA 
<213> Homo sapiens

<400> 20 

tgggagtccg tgctcctgct cctcggttgg ctcctaagtg ccccgccagg tcccctctcc 60 

tttcgctctc ccggctccgg ctcccgactc ttcggcccgc tggcatctgc ttccctcccc 120 

tgcctcgttt ctcgtcgccc ctgctcgctc cccccggcgc tcgcccgggc gctgtgctcg 180 

ctcctggatc gccagccgcg cagccgggct cggccggccg cccgcgcgcc actgtgcagt 240 

ggagtttggt ggaatctctg ctgacgtcac gtcactcccc acacggagta ggagcagagg 300 

gaagagagag ggatgagagg gagggagagg agagagagtg cgagaccgag cgagaaagct 360 

ggagaggagc agaaagaaac tgccagtggc ggctagattt cggaggcccc agtgcacccg 420 

tggactcctt cggaacttgg caccctcagg agccctgcag tcctctcagg cccggctttc 480 

gggcgcttgc cgtgcagccg gaggctcggc tcgctggaaa tcgccccggg aagcagtggg 540 

acgcggagac agcagctctc tcccggtagc cgataacggg gaatggagac caactgccgc 600 

aaactggtgt cggcgtgtgt gcaattaggc gtgcagccgg cggccgttga atgtctcttc 660 

tccaaagact ccgaaatcaa aaaggtcgag ttcacggact ctcctgagag ccgaaaagag 720 

gcagccagca gcaagttctt cccgcggcag catcctggcg ccaatgagaa agataaaagc 780 

cagcagggga agaatgagga cgtgggcgcc gaggacccgt ctaagaagaa gcggcaaagg 840 

cggcagcgga ctcactttac cagccagcag ctccaggagc tggaggccac tttccagagg 900 

aaccgctacc cggacatgtc cacacgcgaa gaaatcgctg tgtggaccaa ccttacggaa 960 

gcccgagtcc gggtttggtt caagaatcgt cgggccaaat ggagaaagag ggagcgcaac 1020 

cagcaggccg agctatgcaa gaatggcttc gggccgcagt tcaatgggct catgcagccc 1080 

tacgacgaca tgtacccagg ctattcctac aacaactggg ccgccaaggg ccttacatcc 1140 

gcctccctat ccaccaagag cttccccttc ttcaactcta tgaacgtcaa ccccctgtca 1200 

tcacagagca tgttttcccc acccaactct atctcgtcca tgagcatgtc gtccagcatg 1260 

gtgccctcag cagtgacagg cgtcccgggc tccagtctca acagcctgaa taacttgaac 1320 

aacctgagta gcccgtcgct gaattccgcg gtgccgacgc ctgcctgtcc ttacgcgccg 1380 

ccgactcctc cgtatgttta tagggacacg tgtaactcga gcctggccag cctgagactg 1440 

aaagcaaagc agcactccag cttcggctac gccagcgtgc agaacccggc ctccaacctg 1500 

agtgcttgcc agtatgcagt ggaccggccc gtgtgagccg cacccacagc gccgggatcc 1560 

taggaccttg ccggatgggg caactccgcc cttgaaagac tgggaattat gctagaaggt 1620 

cgtgggcact aaagaaaggg agagaaagag aagctatata gagaaaagga aaccactgaa 1680 

tcaaagagag agctcctttg atttcaaagg gatgtcctca gtgtctgaca tctttcacta 1740 

caagtatttc taacagttgc aaggacacat acacaaacaa atgtttgact ggatatgaca 1800 

ttttaacatt actataagct tgttattttt taagtttagc attgttaaca tttaaatgac 1860 

tgaaaggatg tatatatatc gaaatgtcaa attaatttta taaaagcagt tgttagtaat 1920 

atcacaacag tgtttttaaa ggttaggctt taaaataaag catgttatac agaagcgatt 1980 

aggatttttc gcttgcgagc aagggagtgt atatactaaa tgccacactg tatgtttcta 2040 

acatattatt attattataa aaaatgtgtg aatatcagtt ttagaatagt ttctctggtg 2100 

gatgcaatga tgtttctgaa actgctatgt acaacctacc ctgtgtataa catttcgtac 2160 

aatattattg ttttactttt cagcaaatat gaaacaaatg tgttttattt catgggagta 2220 

aaatatactg catac 2235 

<210> 21 
<211> 

<212> protein 

<213> polypeptide Sequence 

<220> 

<223> amino acid sequence (Homo sapiens) 
<400> 21 

METNCRKLVSACVQLGVQPAAVECLFSKDSEIKKVEFTDSPESRKEAASSKFFP 
RQHPGANEKDKSQQGKNEDVGAEDPSKKKRQRRQRTHFTSQQLQELEATFQRNR 
YPDMSTREEIAVWTNLTEARVRVWFKNRRAKWRKRERNQQAELCKNGFGPQFNG 
LMQPYDDMYPGYSYNNWAAKGLTSASLSTKSFPFFNSMNVNPLSSQSMFSPPNS 
ISSMSMSSSMVPSAVTGVPGSSLNSLNNLNNLSSPSLNSAVPTPACPYAPPTPP 

YVYRDTCNSSLASLRLKAKQHSSFGYASVQNPASNLSACQYAVDRPV 450 

<210> 22 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 22 

gtaggggagg gaagtagatg t 21 

<210> 23 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 23 



tcctcaactc tacaaaccta aaa 23 
<210> 24 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 24 

agtcgggaga gcgaaa 

<210> 25 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 

<400> 25 

agttgggaga gtgaaa 

<210> 26 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 26 

aagagtcggg agtcgga 

<210> 27 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 27 

aagagttggg agttgga 

<210> 28 
<211> 16 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 28 

ggtcgaagag tcggga 



<210> 29 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 29 

ggttgaagag ttggga 

<210> 30 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 30 

atgttagcgg gtcgaa 

<210> 31 
<211> 16 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> chemically treated genomic DNA (Homo sapiens) 
<400> 31 



tagtgggttg aagagt 



